Abstract

In this paper nonparametric methods to assess the multivariate Levy measure
are introduced. Starting from high-frequency observations of a Levy process $X$, we
construct estimators for its tail integrals and the Pareto Levy copula and prove weak
convergence of these estimators in certain function spaces. Given $n$ observations of
increments over intervals of length $\Delta_n$, the rate of convergence is $k_n^{-1/2}$ for
$k_n = n\Delta_n$, which is natural concerning inference on the Levy measure. Analytic
properties of the Pareto Levy copula which, to the best of our knowledge, have not been
mentioned before in the literature are provided as well. We conclude with a short
simulation study on the performance of our estimators.
1 Introduction



The modeling and estimation of dependencies has attracted increasing attention over
the last decades in various fields of science such as mathematical finance, actuarial
science and hydrology, among others.

In discrete time models, one of the most popular approaches is the concept of copulas,
which allows one to separate the effects of dependence of a random vector from its
univariate marginal behavior. In the bivariate case, the copula of a continuous random
vector $(X, Y)$ is the unique function $C : [0,1]^2 \to [0,1]$ for which

$$\mathbb{P}[X \le x, Y \le y] = C(\mathbb{P}[X \le x], \mathbb{P}[Y \le y]).$$

This formula, known as Sklar's Theorem, is usually interpreted in the way that the copula
completely characterizes the stochastic dependence between $X$ and $Y$ and hence represents
the primary object of interest for investigating dependencies. For introductions to the
concept of copulas in the aforementioned fields of science see McNeil et al. (2005), Frees
and Valdez (1998), Genest and Favre (2007) and references therein. The books of Joe
(1997) and Nelsen (2006) provide compendiums on the mathematical background and on
various parametric models. The large number of applications gave rise to a great demand
for statistical methods, of which semi- and nonparametric estimation in discrete time i.i.d.
models has been investigated in Genest et al. (1995), Fermanian et al. (2004) and Segers
(2011). Nonparametric generalizations to the case of serially dependent stationary time
series have recently been considered in Bücher and Volgushev (2011).



On the other hand, a large number of models in applied stochastics rely on an underlying
process which is defined in continuous time. A basic tool in this framework is the
class of (multidimensional) Levy processes, which provides a flexible way to model
empirically observed behaviour and includes prime examples such as Brownian motion and the
(compound) Poisson process. Statistical methods in this context depend on the nature of
the observation schemes, which are usually classified as high frequency and low frequency
setups. In both areas the literature on nonparametrics has grown considerably over the
last decade. To mention only a few approaches we refer to Jacod (2007) and
Figueroa-Lopez (2009) for the case of high frequency observations, whereas seminal papers
in the low frequency setting are due to Neumann and Reiß (2009) and recently to Nickl and
Reiß (2012).



Our aim in this work is to combine both strands of the literature and to provide
nonparametric methods to estimate the dependence structure of a multivariate Levy process.
For the sake of brevity we will concentrate solely on the bivariate case, but extensions
to the general $d$-dimensional setting are straightforward to obtain as well. Thus, let
$X = (X^{(1)}, X^{(2)})$ be a two-dimensional Levy process with Levy-Ito decomposition

$$X_t = at + B_t + \int_0^t \int_{\|u\| \le 1} u\,(\mu - \bar\mu)(ds, du) + \int_0^t \int_{\|u\| > 1} u\,\mu(ds, du), \tag{1.1}$$



where $a \in \mathbb{R}^2$ is a drift vector, $B$ is a bivariate Brownian motion with some
covariance matrix $\Sigma$, and $\mu$ and $\bar\mu$ are the jump measure of the Levy process
and its compensator, respectively. It is well-known that the compensator takes the form
$\bar\mu(ds, du) = ds\,\nu(du)$, where $\nu$ is the so-called Levy measure of $X$. Given the
choice of the truncation function $h(u) = \mathbf{1}_{\{\|u\| > 1\}}$, the law of $X$ is
uniquely determined by the Levy triplet $(a, \Sigma, \nu)$.

As noted above, in the framework of statistics for stochastic processes it is necessary
to say a few words about the underlying observation scheme. We decide to work in a high
frequency setting, which means in the simplest case that at stage $n$ one is able to observe
one realization of the process $X$ at the equidistant times $i\Delta_n$, $i = 0, \ldots, n$,
for a mesh $\Delta_n \to 0$. An outlook on extensions to a more general setup including
irregularly spaced data and asynchronous observations will be provided in a concluding
section at the end of the paper. Within the class of high frequency settings a further
distinction regards the nature of the covered time horizon. Usually we have either
$n\Delta_n = T$, corresponding to a finite time horizon (a trading day, say), or
$n\Delta_n \to \infty$, which means that the process is eventually observed on the entire
time span $[0, \infty)$.

Due to the independence of the continuous part and the jump part of a Levy process,
the analysis of the stochastic nature of $X$ canonically splits into inference on the
covariance matrix $\Sigma$ and inference on the jump measure $\nu$, since no joint
contribution of the two components is involved. However, estimation of the characteristics
of the Brownian part of $X$ with or without additional jumps is well understood in the high
frequency setup (among others, see Jacod (2008) for a thorough theory on the behaviour of
more general Ito semimartingales), so our focus in this paper will be on the jump dependence
of the two components. In analogy to standard copulas for random vectors we will employ a
concept of a Levy copula to capture the dependence structure within $\nu$, which dates back
to Cont and Tankov (2004) and Kallsen and Tankov (2006). We will follow a slightly
different approach due to Klüppelberg and Resnick (2008) and Eder and Klüppelberg
(2012), however, and focus on nonparametric methods to assess the closely related Pareto
Levy copula.



Besides parametric approaches to infer the (Pareto) Levy copula such as Esmaeili and
Klüppelberg (2011), nonparametric methods in this area are hardly available. To the best
of our knowledge, the only concept is due to the unpublished work of Laeven (2011), who
constructs an estimator for the Levy copula based on a representation in the limit involving
ordinary copulas and provides some asymptotic properties, but for which no explicit proof
is available. On the other hand, since the (Pareto) Levy copula captures the tendency
of the process to have joint (largely negative) jumps, the need for reliable nonparametric
estimators is evident from practice, particularly with a view on finance. This convinces
us that there is a clear gap in the literature which we aim to fill in this work.

In contrast to Laeven's method, our approach will be based directly on the defining
relation of the Pareto Levy copula $\Gamma$, which involves tail integrals of both the Levy
measure and its marginals. For simplicity, we will focus on the spectrally positive case
only, that is we assume that $X$ has only positive jumps in both directions, or equivalently
that the Levy measure $\nu$ has support on $[0, \infty)^2 \setminus \{(0, 0)\}$. $\Gamma$ will
then naturally be a function on the same space. In the case where all tail integrals are
continuous, we obtain a representation of $\Gamma$ as a functional of those, and we propose to
estimate $\Gamma$ by using appropriate estimators for the tail integrals. It turns out that in
order to do so, we are forced to work in the high frequency setting with infinite time
horizon, that is $n\Delta_n \to \infty$. Under some rather mild assumptions we are then able
to prove weak convergence of a suitably standardized version of $\hat\Gamma_n - \Gamma$ in a
certain function space, which will be our main result. As a by-product we obtain a Donsker
theorem for the bivariate Levy measure as well, a result which is similar in spirit to the
recent work of Nickl and Reiß (2012), but in a high-frequency setting rather than a
low-frequency world.

The paper is organized as follows: Section 2 is devoted to a brief discussion of jump
dependence of bivariate Levy processes. We summarize the concept of Pareto Levy copulas
and derive some of their analytical properties. In Section 3 we define estimators for
bivariate tail integrals, as well as for their associated Pareto Levy copulas. Weak
convergence of these estimators is discussed in Section 4. A brief discussion of our results
and a small simulation study are provided in Section 5, whereas some conclusions are given
in Section 6. Finally, some technical results are postponed to Section 7.



2 Jump dependence and the Pareto Levy copula 



Suppose that we are given a bivariate Levy process $X$ of the form (1.1) where $\nu$ denotes
its Levy measure. As already stated in the introduction, one assumption will be that $\nu$
has support on $[0, \infty)^2 \setminus \{(0, 0)\}$, which means that both components of $X$
only have positive jumps. This condition is for notational convenience in the first place,
as we will see later that one can follow a similar approach in order to estimate the jump
dependence in the other three quadrants as well.

Let us review some recent concepts on jump dependence. The basic quantity in this
framework is the bivariate tail integral $U$ of $\nu$, which for the moment will be defined
as a function from $[0, \infty]^2 \setminus \{(0, 0)\}$ to $\mathbb{R}$ given by

$$U(x) = \nu([x_1, \infty] \times [x_2, \infty]), \quad x = (x_1, x_2). \tag{2.1}$$

From the theory of Levy processes it is well-known that this quantity gives the average
number of jumps of $X$ which fall into the set $[x_1, \infty] \times [x_2, \infty]$ during a
time period of length one. Since $X$ has cadlag paths, $U(x)$ is necessarily finite. In the
same way, we are able to introduce marginal tail integrals. Precisely, let
$U_i : [0, \infty] \to [0, \infty]$, $i = 1, 2$, be defined via

$$U_1(x_1) = \nu([x_1, \infty] \times \mathbb{R}) \quad \text{and} \quad U_2(x_2) = \nu(\mathbb{R} \times [x_2, \infty]). \tag{2.2}$$

Again, $U_i(x_i)$ is finite for $x_i > 0$, but in the infinite activity case we have
$U_i(0) = \infty$, and since this is typically satisfied for Levy processes, we will assume
such a property for $i = 1, 2$ as well.

It is obvious that the entire information about $\nu$ is contained in the tail integral $U$.
Therefore, just as for regular copulas, one might be interested in splitting $U$ into several
functions which are related to the jump behaviour of $X$ in the marginals (naturally given by
the univariate tail integrals $U_i$) and a Levy copula $C$ which captures the specific
tendency of $X$ to have joint jumps. Having this intuition in mind, Cont and Tankov provided
the following definition.
Definition 2.1 A bivariate Levy copula for Levy processes with positive jumps is a
function $C : [0, \infty]^2 \setminus \{(\infty, \infty)\} \to [0, \infty)$ which

(i) is grounded, that is $C(x, 0) = C(0, x) = 0$ for all $x \in [0, \infty]$;

(ii) has uniform margins, so $C(x, \infty) = C(\infty, x) = x$ for all $x \in [0, \infty)$;

(iii) is 2-increasing, that is $C(x_1, x_2) - C(x_1, y_2) - C(y_1, x_2) + C(y_1, y_2) \ge 0$ for
all $x_1 \le y_1$ and $x_2 \le y_2$.
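
To make Definition 2.1 concrete, the following minimal sketch checks the three defining
properties numerically for the Clayton Levy copula
$C_\theta(u, v) = (u^{-\theta} + v^{-\theta})^{-1/\theta}$, a standard parametric family. The
choice of this family, the value $\theta = 2$, the grid and all function names are
assumptions made for illustration only; they are not part of the paper.

```python
import itertools
import math

def clayton_levy_copula(u, v, theta=2.0):
    """Clayton Levy copula C(u, v) = (u^-theta + v^-theta)^(-1/theta), theta > 0."""
    if u == 0.0 or v == 0.0:
        return 0.0                      # grounded: C(x, 0) = C(0, x) = 0
    if math.isinf(u):
        return v                        # uniform margin: C(inf, x) = x
    if math.isinf(v):
        return u                        # uniform margin: C(x, inf) = x
    return (u ** -theta + v ** -theta) ** (-1.0 / theta)

def rectangle_mass(c, x1, x2, y1, y2):
    """C(x1, x2) - C(x1, y2) - C(y1, x2) + C(y1, y2) for x1 <= y1 and x2 <= y2."""
    return c(x1, x2) - c(x1, y2) - c(y1, x2) + c(y1, y2)

# Finite grid only: the point (inf, inf) is excluded from the domain of C.
grid = [0.0, 0.1, 0.5, 1.0, 2.0, 10.0]

# combinations_with_replacement on a sorted grid yields pairs (a, b) with a <= b.
min_mass = min(
    rectangle_mass(clayton_levy_copula, x1, x2, y1, y2)
    for x1, y1 in itertools.combinations_with_replacement(grid, 2)
    for x2, y2 in itertools.combinations_with_replacement(grid, 2)
)
```

A value of `min_mass` that is non-negative up to rounding error is consistent with
property (iii) on the chosen grid; it is of course not a proof.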

The main result on Levy copulas is a version of the famous Sklar's theorem which
states that for each tail integral $U$ with marginals $U_1$ and $U_2$ there exists a Levy
copula $C$ such that

$$U(x) = C(U_1(x_1), U_2(x_2)), \quad x = (x_1, x_2) \in [0, \infty]^2 \setminus \{(0, 0)\},$$

holds. Similarly to the usual copula, $C$ is uniquely defined if $U_1$ and $U_2$ are
continuous. Therefore continuity of $U_i$ is a natural condition in order to secure that the
concept of copulas is appropriate, and it becomes our third main assumption. Also, if both
marginal tail integrals are strictly decreasing, we obtain a representation of $C$ via

$$C(u) = U(U_1^{-1}(u_1), U_2^{-1}(u_2)), \quad u = (u_1, u_2) \in [0, \infty]^2 \setminus \{(\infty, \infty)\}. \tag{2.3}$$

We will see later that some smoothness assumptions on $\nu$ are necessary for estimation
purposes, from which strict monotonicity of the marginal tail integrals follows.

The inverse statement of Sklar's theorem is true as well, which states that knowledge
of the marginals $U_i$ and the Levy copula $C$ determines $U$ completely and thus in turn
$\nu$.

A drawback of the approach of Cont and Tankov (2004) is, however, that $C$ is not a tail
integral, in contrast to the regular copula of a random vector which couples marginal
distribution functions and is a bivariate distribution function itself. This circumstance
makes the interpretation of a Levy copula quite difficult, and for that reason it appears
to be natural to focus on an alternative notion of copula in this setting.

Definition 2.2 A bivariate Pareto Levy copula for Levy processes with positive jumps is
a function $\Gamma : [0, \infty]^2 \setminus \{(0, 0)\} \to [0, \infty)$ which

(i) is grounded, that is $\Gamma(u, \infty) = \Gamma(\infty, u) = 0$ for all $u \in (0, \infty]$;

(ii) has Pareto margins, so $\Gamma(u, 0) = \Gamma(0, u) = 1/u$ for all $u \in (0, \infty]$;

(iii) is 2-increasing.

As usual, we set $1/\infty = 0$ and vice versa. Following Eder and Klüppelberg (2012),
Sklar's theorem now reads as follows: Given $U$ and its marginals, we have

$$U(x) = \Gamma(1/U_1(x_1), 1/U_2(x_2)), \quad x = (x_1, x_2) \in [0, \infty]^2 \setminus \{(0, 0)\} \tag{2.4}$$
for some unique Pareto Levy copula $\Gamma$, and we obtain the relation

$$\Gamma(u) = U(U_1^{-1}(1/u_1), U_2^{-1}(1/u_2)), \quad u = (u_1, u_2) \in [0, \infty]^2 \setminus \{(0, 0)\}. \tag{2.5}$$

The difference to the approach of Cont and Tankov (2004) is that the marginals of $\Gamma$
correspond to Pareto tails, which are the tail integrals of a 1-stable Levy process on the
positive half line. Since $\Gamma$ is 2-increasing as well, it is a simple task to deduce that
it satisfies the properties of a tail integral of a spectrally positive Levy process as
claimed. Thus the Pareto Levy copula allows for the interpretation that the marginals of
$\nu$ are standardized to the Levy measures of a 1-stable Levy process, which is similar in
spirit to the ordinary copula concept where marginals are standardized to uniform
distributions.

Finally, we collect some basic properties of Pareto Levy copulas, some of which already
have been stated in Cont and Tankov (2004) and Kallsen and Tankov (2006) in the context
of Levy copulas.

Proposition 2.3 Every Pareto Levy copula $\Gamma$ has the following properties.

(i) ('Lipschitz continuity') $|\Gamma(u) - \Gamma(v)| \le |1/u_1 - 1/v_1| + |1/u_2 - 1/v_2|$.

(ii) (Monotonicity) $\Gamma$ is 2-increasing and the functions $\Gamma(u, \cdot)$ and
$\Gamma(\cdot, u)$ are non-increasing for each fixed $u > 0$.

(iii) ('Frechet-Hoeffding bounds') $\Gamma_\perp \le \Gamma \le \Gamma_\parallel$, where
$\Gamma_\perp(u) = u_1^{-1} \mathbf{1}_{\{u_2 = 0\}} + u_2^{-1} \mathbf{1}_{\{u_1 = 0\}}$ and
$\Gamma_\parallel(u) = (u_1 \vee u_2)^{-1}$ denote the Pareto Levy copulas corresponding
to independence and to perfect positive dependence, respectively.

(iv) (Partial derivatives) $\dot\Gamma_1(u_1, 0) = -u_1^{-2}$ and $\dot\Gamma_1(u_1, \infty) = 0$.
For fixed $u_2 \in (0, \infty)$, the partial derivative $\dot\Gamma_1(u_1, u_2)$ exists for
almost all $u_1 \in (0, \infty)$, and for such $u_1$ and $u_2$

$$0 \ge \dot\Gamma_1(u_1, u_2) \ge -u_1^{-2}.$$

Similarly, $\dot\Gamma_2(0, u_2) = -u_2^{-2}$, $\dot\Gamma_2(\infty, u_2) = 0$ and for each
$u_1 \in (0, \infty)$ the partial derivative $\dot\Gamma_2(u_1, u_2)$ exists for almost all
$u_2 \in (0, \infty)$ with

$$0 \ge \dot\Gamma_2(u_1, u_2) \ge -u_2^{-2}.$$

Furthermore, the mappings $u_2 \mapsto \dot\Gamma_1(u_1, u_2)$ and
$u_1 \mapsto \dot\Gamma_2(u_1, u_2)$ are defined and non-decreasing almost everywhere.



Proof. Observing $\Gamma(u) = C(1/u_1, 1/u_2)$ with the Levy copula $C$, assertion (i) follows
from Lemma 3.2 in Kallsen and Tankov (2006). Assertion (ii) follows from the fact that
$\Gamma$ is 2-increasing and grounded by Definition 2.2. The lower bound in (iii) is obvious.
By Theorem 5.1 in Kallsen and Tankov (2006) we have
$\Gamma(u) = \lim_{t \to 0} t^{-1} C_t(t/u_1, t/u_2)$ for some (ordinary) copulas
$C_t : [0, 1]^2 \to [0, 1]$. It is well-known that every copula is bounded above by the
Frechet-Hoeffding bound $M(u) = u_1 \wedge u_2$, whence setting $C_t = M$ for all $t$ yields
assertion (iii). Regarding (iv) we only consider $\dot\Gamma_1$. The assertion is obvious for
$u_2 \in \{0, \infty\}$. Monotonicity of $u_1 \mapsto \Gamma(u_1, u_2)$ for each $u_2$ proves
existence of $\dot\Gamma_1(u_1, u_2) \le 0$ for almost all $u_1 \in (0, \infty)$ and all
$u_2 \in (0, \infty)$. Moreover, for each such $u_1, u_2$ by Lipschitz continuity,

$$|\dot\Gamma_1(u_1, u_2)| = \lim_{t \to 0} \left| \frac{\Gamma(u_1 + t, u_2) - \Gamma(u_1, u_2)}{t} \right| \le \lim_{t \to 0} \left| \frac{1/(u_1 + t) - 1/u_1}{t} \right| = u_1^{-2}.$$

Finally, fix $v_2 < u_2$ and consider $u_1 \mapsto \Gamma(u_1, v_2) - \Gamma(u_1, u_2)$. This
mapping is non-increasing according to part (ii), and hence its first derivative
$\dot\Gamma_1(u_1, v_2) - \dot\Gamma_1(u_1, u_2)$ exists almost everywhere and is non-positive.
This proves the final assertion. □
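
As a numerical sanity check of Proposition 2.3 (i) and (iii), the sketch below evaluates
the Pareto Levy copula $\Gamma_\theta(u) = (u_1^\theta + u_2^\theta)^{-1/\theta}$, which arises
from the Clayton Levy copula through $\Gamma(u) = C(1/u_1, 1/u_2)$. The Clayton family,
$\theta = 2$, the grid and the function names are illustrative assumptions, not taken from the
paper.

```python
import itertools

def pareto_clayton(u1, u2, theta=2.0):
    """Pareto Levy copula of Clayton type: (u1^theta + u2^theta)^(-1/theta)."""
    return (u1 ** theta + u2 ** theta) ** (-1.0 / theta)

def perfect_dependence(u1, u2):
    """Upper Frechet-Hoeffding bound (u1 v u2)^(-1)."""
    return 1.0 / max(u1, u2)

grid = [0.1, 0.5, 1.0, 2.0, 10.0]
pairs = list(itertools.product(grid, repeat=2))
tol = 1e-12

# (iii): in the interior the lower bound is 0, the upper bound is 1 / max(u1, u2).
bounds_ok = all(
    0.0 <= pareto_clayton(*u) <= perfect_dependence(*u) + tol for u in pairs
)

# (i): |Gamma(u) - Gamma(v)| <= |1/u1 - 1/v1| + |1/u2 - 1/v2|.
lipschitz_ok = all(
    abs(pareto_clayton(*u) - pareto_clayton(*v))
    <= abs(1 / u[0] - 1 / v[0]) + abs(1 / u[1] - 1 / v[1]) + tol
    for u in pairs
    for v in pairs
)

# Pareto margins from Definition 2.2: Gamma(u, 0) = 1/u.
margins_ok = all(abs(pareto_clayton(u, 0.0) - 1 / u) < tol for u in grid)
```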



3 Estimation of bivariate tail integrals and Pareto Levy copulas

In the following we are interested in the construction of an estimator $\hat\Gamma_n$ for
$\Gamma$ which is based on relation (2.5) and empirical versions of the tail integrals $U$,
$U_1$ and $U_2$. Such estimators have for instance been discussed in Figueroa-Lopez (2008)
in the univariate setting, and we will transfer them naturally to the bivariate case.

Before we introduce these empirical versions, it turns out to be convenient to change
the domain of $U$ slightly. Since by assumption no negative jumps are involved, we have

$$\nu([x_1, \infty] \times [0, \infty]) = \nu([x_1, \infty] \times [-\infty, \infty])$$

for each $x_1 > 0$, and similarly for the second component. Therefore it is equally well
possible to define $U$ in the same way as before, but as a function
$U : \mathbb{H} \to \mathbb{R}$, where

$$\mathbb{H} = (0, \infty]^2 \cup (\{-\infty\} \times (0, \infty]) \cup ((0, \infty] \times \{-\infty\}).$$

Note that $U$ corresponds on the stripes through $-\infty$ to the marginal tail integrals
$U_1$ and $U_2$, respectively.

Our estimator for the function $U$ will be defined on $\mathbb{H}$ as well, and precisely we
set

$$U_n(x) = \frac{1}{k_n} \sum_{j=1}^{n} \mathbf{1}_{\{\Delta_j^n X^{(1)} \ge x_1,\ \Delta_j^n X^{(2)} \ge x_2\}}, \quad x = (x_1, x_2), \tag{3.1}$$

where $k_n = n\Delta_n$ and $\Delta_j^n X^{(i)} = X^{(i)}_{j\Delta_n} - X^{(i)}_{(j-1)\Delta_n}$
denotes the $j$-th increment of $X^{(i)}$, $i = 1, 2$. Having the role of the stripes through
$-\infty$ in mind, we obtain empirical versions of the univariate tail integrals through

$$U_{n,1}(x_1) = U_n(x_1, -\infty) = \frac{1}{k_n} \sum_{j=1}^{n} \mathbf{1}_{\{\Delta_j^n X^{(1)} \ge x_1\}}, \quad x_1 \in (0, \infty], \tag{3.2}$$

and analogously for $U_{n,2}$. Weak convergence of $U_n$ in an appropriate function space is
established in Proposition 4.2 below.
The underlying idea behind $U_n$ is rather natural, given the interpretation of $U$ as
the average number of jumps of a certain size during the unit interval. Stationarity
and independence of increments of a Levy process ensure that the same behaviour is to
be expected over intervals of arbitrary size, as long as $U$ is standardized accordingly.
Therefore a canonical idea is to count joint large increments of $X^{(1)}$ and $X^{(2)}$, as
they indicate joint large jumps over the corresponding time interval, and this is precisely
what $U_n$ does. Note that in order for $U_n$ to be consistent, it is necessary to be in the
high-frequency setting with infinite time horizon, that is $k_n \to \infty$. On each fixed
time interval $[0, T]$ there are only finitely many jumps larger than a given size, which is
clearly not sufficient to draw inference on the entire distribution of the jumps.
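
The counting idea behind the empirical tail integral can be sketched in a few lines. The
function below is a minimal illustration under stated assumptions: increments are passed
as a list of pairs, a threshold of `-math.inf` encodes the stripes through $-\infty$, and
non-strict inequalities are used. The function name, the argument layout and the toy data
are hypothetical.

```python
import math

def empirical_tail_integral(increments, x1, x2, delta):
    """Count increments exceeding (x1, x2) componentwise, scaled by k_n = n * delta."""
    k_n = len(increments) * delta
    count = sum(1 for d1, d2 in increments if d1 >= x1 and d2 >= x2)
    return count / k_n

# Toy data: four bivariate increments observed over intervals of length 0.5.
toy_increments = [(0.5, 0.2), (1.5, 2.0), (-0.1, 3.0), (2.0, 0.1)]
toy_delta = 0.5

joint = empirical_tail_integral(toy_increments, 1.0, 0.05, toy_delta)
marginal = empirical_tail_integral(toy_increments, 1.0, -math.inf, toy_delta)
```

Passing `-math.inf` as the second threshold removes the constraint on the second
component, so `marginal` is the empirical version of the first marginal tail integral.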

In order to construct an empirical version of (2.5) a notion of a generalized inverse
function is of importance. For any $f : (0, \infty] \to [0, \infty)$ which is monotonically
decreasing, left-continuous and satisfies $f(\infty) = 0$ we define
$f^- : (0, \infty] \to [0, \infty)$ via

$$f^-(z) = \inf\{x > 0 \mid f(x) \le z\}. \tag{3.3}$$



Definition 3.1 Let $U$ be the tail integral of a bivariate Levy process with positive jumps
and $U_1$, $U_2$ be its marginal tail integrals. Using their empirical versions (3.1) and
(3.2) we define the empirical Pareto Levy copula as

$$\hat\Gamma_n(u) = U_n\big(U_{n,1}^-(1/u_1)^{\circ},\ U_{n,2}^-(1/u_2)^{\circ}\big), \quad u = (u_1, u_2) \in [0, \infty]^2 \setminus \{(0, 0)\}, \tag{3.4}$$

where $U_{n,i}^-$ is the generalized inverse function of $U_{n,i}$ as defined in (3.3), with the
convention that $U_{n,i}^-(1/\infty) = U_{n,i}^-(0) = \infty$, and where
$a^{\circ} = a \mathbf{1}_{\{a > 0\}} - \infty \mathbf{1}_{\{a = 0\}}$ for some
$a \in [0, \infty]$. Finally, we set $U_n(-\infty, -\infty) = n/k_n$.
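
A minimal sketch of how Definition 3.1 could be evaluated on data is given below. It
assumes non-strict inequalities in the counting step, encodes the map sending $0$ to
$-\infty$ by a small helper, and computes the generalized inverse from order statistics.
All names (`generalized_inverse`, `to_stripe`, `empirical_pareto_levy_copula`) and the toy
data are hypothetical, and exact ties in `z * k_n` are subject to floating point rounding.

```python
import math

def generalized_inverse(values, z, k_n):
    """inf{x > 0 : #{d in values : d >= x} / k_n <= z}, with value inf at z = 0."""
    if z == 0.0:
        return math.inf                      # convention for 1/u with u = inf
    m = len(values) if math.isinf(z) else math.floor(z * k_n)
    ordered = sorted(values, reverse=True)
    if m >= len(ordered):
        return 0.0                           # every x > 0 satisfies the constraint
    # The count drops to at most m just above the (m+1)-th largest value.
    return max(ordered[m], 0.0)

def to_stripe(a):
    """Map a > 0 to itself and a = 0 to -inf (the stripe through -inf)."""
    return a if a > 0 else -math.inf

def empirical_pareto_levy_copula(increments, u1, u2, delta):
    k_n = len(increments) * delta
    first = [d[0] for d in increments]
    second = [d[1] for d in increments]
    z1 = math.inf if u1 == 0 else 1.0 / u1
    z2 = math.inf if u2 == 0 else 1.0 / u2
    x1 = to_stripe(generalized_inverse(first, z1, k_n))
    x2 = to_stripe(generalized_inverse(second, z2, k_n))
    return sum(1 for d1, d2 in increments if d1 >= x1 and d2 >= x2) / k_n

toy_increments = [(3.0, 1.0), (2.0, 4.0), (1.0, -0.5),
                  (0.5, 2.0), (-1.0, 3.0), (0.2, 0.1)]
toy_delta = 0.5                              # so that k_n = 3
```

With left-continuous step functions the value at the generalized inverse can overshoot
the target level by one count, so on the margin the sketch returns $1/u_1 + 1/k_n$ rather
than exactly $1/u_1$ in this toy example.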



Remark 3.2 In order to understand why $a^{\circ}$ has to be introduced, suppose that we are
interested in estimating $\Gamma(u_1, 0)$ (even though it is known to take the value $1/u_1$).
Our estimator becomes $U_n(U_{n,1}^-(1/u_1), -\infty)$ then, which is in general close to
$1/u_1$ due to the definition of $U_{n,1}$. On the other hand, if we forget about $a^{\circ}$,
we obtain $U_n(U_{n,1}^-(1/u_1), 0)$, which only counts those increments of $X$ where the first
component exceeds $U_{n,1}^-(1/u_1)$ and the second one is non-negative. Due to the existence
of a Brownian part in $X$, however, we cannot expect these two estimators to be close, since
a number of increments in the second component are indeed negative, and thus this estimator
is considerably smaller than $\Gamma(u_1, 0)$. □



Remark 3.3 In the general case of arbitrary jumps a similar construction allows the
estimation of $\Gamma$ in the interior of each of the four quadrants separately. Indeed, Eder
and Klüppelberg (2012) give a general notion of tail integrals and Pareto Levy copulas in
their Definition 4, and from Sklar's theorem in this context (which is their Theorem 1) we
know that the same relation as (2.5) holds for $u \in (\mathbb{R} \setminus \{0\})^2$ and
determines $\Gamma$ uniquely. For the sake of brevity we omit the full theory in this
setting. □
4 Results on weak convergence

Our aim in this section is to prove a result on weak convergence of the estimator
$\hat\Gamma_n$, but as a by-product we obtain such a claim for $U_n$ as well. Before we come
to the main theorem, let us briefly summarize our assumptions on $\nu$, most of which have
already been given in the previous paragraphs.



Assumption 4.1 Let $X$ be a bivariate Levy process with the representation (1.1). The
following assumptions on $\nu$ are in order:

(i) $\nu$ has support on $[0, \infty)^2 \setminus \{(0, 0)\}$.

(ii) On this set it takes the form $\nu(du) = s(u)\,du$ for a positive Levy density $s$ which
satisfies

$$\sup_{u \in M_\eta} \big(|s(u)| + \|\nabla s(u)\|\big) < \infty$$

for any $\eta \in (0, \infty)$, where

$$M_\eta = (\eta, \infty)^2 \cup (\{0\} \times (\eta, \infty)) \cup ((\eta, \infty) \times \{0\})$$

and $\nabla s$ denotes the gradient of $s$ on $(\eta, \infty)^2$ and the univariate derivative
on the stripes through 0, respectively.

(iii) $\nu$ has infinite activity, that is $\nu([0, \infty) \times [0, \infty)) = \infty$.



Assumption 4.1 (ii) had not been stated previously. It is used to prove a second order
condition regarding the difference between $U$ and the expectation of $U_n$, for which we
generalize a result due to Figueroa-Lopez and Houdre (2009) from the univariate setting
to the multidimensional case. Continuity and (strict) monotonicity of the marginal tail
integrals as claimed before are obvious consequences of it.

We begin with a result on weak convergence of $U_n$, and to this end we have to define
the function space on which the asymptotics take place. Let $\mathcal{B}_\infty(\mathbb{H})$
be the space of all functions $f : \mathbb{H} \to \mathbb{R}$ which are bounded on any subset
of $\mathbb{H}$ that is bounded away from the origin and from the points $(-\infty, 0)$ and
$(0, -\infty)$. We consider the metric inducing the topology of uniform convergence on those
subsets, defined by

$$d(f, g) = \sum_{k=1}^{\infty} 2^{-k} \big(\|f - g\|_{T_k} \wedge 1\big),$$

where $T_k = [1/k, \infty]^2 \cup (\{-\infty\} \times [1/k, \infty]) \cup ([1/k, \infty] \times \{-\infty\})$
and $\|f\|_{T_k} = \sup_{u \in T_k} |f(u)|$. This space is a complete metric space, and a
sequence converges in $\mathcal{B}_\infty(\mathbb{H})$ if and only if it converges uniformly
on each $T_k$.
Proposition 4.2 Assume that $X$ is a Levy process satisfying Assumption 4.1. If the
observation scheme meets the conditions

$$\Delta_n \to 0, \quad k_n \to \infty, \quad \sqrt{k_n}\,\Delta_n \to 0, \tag{4.1}$$

then we have

$$\gamma_n(x) = \sqrt{k_n}\,\{U_n(x) - U(x)\} \rightsquigarrow \mathbb{B}(x)$$

in $(\mathcal{B}_\infty(\mathbb{H}), d)$, where $\mathbb{B}$ is a tight, centered Gaussian
process with covariance

$$E[\mathbb{B}(x)\mathbb{B}(y)] = U(x \vee y) = U(x_1 \vee y_1, x_2 \vee y_2).$$

The sample paths of $\mathbb{B}$ are uniformly continuous on each $T_k$ with respect to the
pseudo distance

$$\rho(x, y) = E\big[(\mathbb{B}(x) - \mathbb{B}(y))^2\big]^{1/2} = \{U(x) + U(y) - 2U(x \vee y)\}^{1/2} = |U(x) - U(y)|^{1/2}.$$
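
To see which sampling schemes are compatible with (4.1), consider the polynomial mesh
$\Delta_n = n^{-\alpha}$. Then $k_n = n^{1-\alpha}$ and
$\sqrt{k_n}\,\Delta_n = n^{(1-3\alpha)/2}$, so all three conditions hold exactly when
$1/3 < \alpha < 1$. The short sketch below evaluates the three quantities numerically; the
polynomial mesh and the function name `observation_scheme` are assumptions chosen for
illustration.

```python
def observation_scheme(n, alpha):
    """Return (Delta_n, k_n, sqrt(k_n) * Delta_n) for the mesh Delta_n = n^(-alpha)."""
    delta = n ** -alpha
    k_n = n * delta
    return delta, k_n, k_n ** 0.5 * delta

# alpha = 0.5 lies in (1/3, 1): mesh and bias term shrink, the horizon k_n grows.
small_ok = observation_scheme(10 ** 4, 0.5)
large_ok = observation_scheme(10 ** 8, 0.5)

# alpha = 0.25 violates the third condition: sqrt(k_n) * Delta_n grows with n.
small_bad = observation_scheme(10 ** 4, 0.25)
large_bad = observation_scheme(10 ** 8, 0.25)
```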



For the proof of Proposition 4.2 the following lemma is extremely useful. Its univariate
version is a special case of a more general result in Figueroa-Lopez and Houdre (2009).



Lemma 4.3 Suppose that Assumption 4.1 holds and let $\delta > 0$ be fixed. Then there exist
constants $K = K(\delta)$ and $t_0 = t_0(\delta)$ such that the uniform bound

$$\big| P(X_t^{(1)} \ge x_1, X_t^{(2)} \ge x_2) - t\,\nu([x_1, \infty) \times [x_2, \infty)) \big| \le K t^2$$

holds for all $x = (x_1, x_2) \in [\delta, \infty]^2 \cup (\{-\infty\} \times [\delta, \infty]) \cup ([\delta, \infty] \times \{-\infty\})$
and $0 < t < t_0$.
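
The order $t^2$ in Lemma 4.3 can be illustrated numerically in a toy univariate case where
all quantities are explicit. The sketch below assumes a compound Poisson process with
intensity $\lambda$ and standard exponential jumps, so that $\nu([x, \infty)) = \lambda e^{-x}$.
This example has finite activity and therefore does not satisfy Assumption 4.1 (iii); it
only shows the second order behaviour of the bound, and all names are hypothetical.

```python
import math

def cpp_tail_probability(t, x, lam=1.0, terms=60):
    """P(X_t >= x), x > 0, for a compound Poisson process with Exp(1) jumps."""
    total = 0.0
    poisson_weight = math.exp(-lam * t)      # P(N_t = 0)
    gamma_partial = 0.0                      # running sum of x^j / j! over j < k
    power_term = 1.0                         # x^j / j! for the current j
    for k in range(1, terms + 1):
        poisson_weight *= lam * t / k        # now P(N_t = k)
        gamma_partial += power_term          # now sum_{j=0}^{k-1} x^j / j!
        power_term *= x / k                  # now x^k / k!
        # P(sum of k Exp(1) jumps >= x) = exp(-x) * sum_{j<k} x^j / j!
        total += poisson_weight * math.exp(-x) * gamma_partial
    return total

def second_order_ratio(t, x, lam=1.0):
    """|P(X_t >= x) - t * nu([x, inf))| / t^2, which should stay bounded as t -> 0."""
    return abs(cpp_tail_probability(t, x, lam) - t * lam * math.exp(-x)) / t ** 2

ratios = [second_order_ratio(t, 2.0) for t in (0.2, 0.1, 0.05, 0.025)]
```

A Taylor expansion in $t$ suggests that for $\lambda = 1$ the ratio approaches
$e^{-x}(x - 1)/2$ as $t \to 0$, which the values in `ratios` reflect for $x = 2$.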
Before we come to the result on $\hat\Gamma_n$, let us introduce an oracle estimator for
$\Gamma$. We set

$$\tilde\Gamma_n(u) = U_n\big(U_1^{-1}(1/u_1),\ U_2^{-1}(1/u_2)\big), \quad u = (u_1, u_2) \in [0, \infty]^2 \setminus \{(0, 0)\}, \tag{4.2}$$

which means that we replace the inverses of the empirical marginal tail integrals by the
unobservable true ones. Thanks to Proposition 4.2 we obtain weak convergence of a restricted
version of this intermediate estimator in the space $\mathcal{B}_\infty((0, \infty]^2)$ of all
real functions on $(0, \infty]^2$ that are bounded on sets which are bounded away from the
origin. In a similar spirit as before, we equip this space with the metric
$d(f, g) = \sum_{k=1}^{\infty} 2^{-k} (\|f - g\|_{T_k} \wedge 1)$, where
$T_k = [1/k, \infty]^2$. Setting $x = (U_1^{-1}(1/u_1), U_2^{-1}(1/u_2))$ and observing that
$U_i^{-1}(k) \ge k' > 0$ for some $k'$, the continuous mapping theorem immediately yields the
following result.



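Corollary 4.4 Under the conditions of Proposition 4.2 we have

$$\delta_n(u) = \sqrt{k_n}\,\big(\tilde\Gamma_n(u) - \Gamma(u)\big) \rightsquigarrow \mathbb{B}\big(U_1^{-1}(1/u_1),\ U_2^{-1}(1/u_2)\big)$$

in $(\mathcal{B}_\infty((0, \infty]^2), d)$ with $\mathbb{B}$ as defined in Proposition 4.2.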
From a statistical point of view there is no loss in information when estimating
$\Gamma(u)$ on $(0, \infty]^2$ instead of the entire domain $[0, \infty]^2 \setminus \{(0, 0)\}$,
since a Pareto Levy copula is grounded by definition and thus known on stripes through 0.
This remark remains valid for the final result of this section as well, which is on weak
convergence of the estimator $\hat\Gamma_n(u)$.



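Theorem 4.5 Assume that $X$ is a Levy process satisfying Assumption 4.1. If (4.1) holds,
then we have

$$\alpha_n(u) = \sqrt{k_n}\,\big(\hat\Gamma_n(u) - \Gamma(u)\big) \rightsquigarrow \mathbb{G}(u)$$

in $(\mathcal{B}_\infty((0, \infty]^2), d)$. Here the process $\mathbb{G}$ is defined as

$$\mathbb{G}(u) = \tilde{\mathbb{G}}(u) + u_1^2\,\dot\Gamma_1(u)\,\tilde{\mathbb{G}}(u_1, -\infty) + u_2^2\,\dot\Gamma_2(u)\,\tilde{\mathbb{G}}(-\infty, u_2), \tag{4.3}$$

where $\tilde{\mathbb{G}}$ denotes a tight centered Gaussian field on $\mathbb{H}$ with
covariance structure

$$E\big[\tilde{\mathbb{G}}(u)\tilde{\mathbb{G}}(v)\big] = \Gamma(u \vee v) = \Gamma(u_1 \vee v_1, u_2 \vee v_2),$$

using the convention $\Gamma(u, -\infty) = \Gamma(-\infty, u) = 1/u$. The sample paths of
$\tilde{\mathbb{G}}$ are uniformly continuous on each $T_k$ with respect to the pseudo distance

$$\rho(u, v) = E\big[(\tilde{\mathbb{G}}(u) - \tilde{\mathbb{G}}(v))^2\big]^{1/2} = |\Gamma(u) - \Gamma(v)|^{1/2}.$$

If both coordinates of $u$ are distinct from $\infty$, then $\dot\Gamma_i(u)$ exists as a
consequence of (2.5) and Assumption 4.1, and $\mathbb{G}(u)$ is well-defined. On the other
hand, if one of the components equals $\infty$, we have $\tilde{\mathbb{G}}(u) = 0$ almost
surely, and also $\dot\Gamma_1(u_1, \infty) = 0$ and $\dot\Gamma_2(\infty, u_2) = 0$ from
Proposition 2.3. Hence, the right hand side of (4.3) is well-defined as well, and we have
$\mathbb{G}(u) = 0$ almost surely in this case.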



Proof of Lemma 4.3. In large parts the proof is similar to the one of the result in
Figueroa-Lopez and Houdre (2009), which is why we will only give the main steps
and restrict ourselves to the genuine bivariate case of $x_1, x_2 \ne -\infty$. First, let
$\varepsilon < (\delta/2 \wedge 1)$ and pick a smooth function
$c_\varepsilon : \mathbb{R}^2 \to \mathbb{R}$ satisfying

$$\mathbf{1}_{[-\varepsilon/2, \varepsilon/2]}(\|u\|) \le c_\varepsilon(u) \le \mathbf{1}_{[-\varepsilon, \varepsilon]}(\|u\|).$$

Here and throughout the proof, $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^2$. We
also define the function $\bar c_\varepsilon$ via $\bar c_\varepsilon(u) = 1 - c_\varepsilon(u)$.
It is straightforward to see that there exist independent processes $\tilde X^\varepsilon$ and
$X^\varepsilon$ such that $X \sim \tilde X^\varepsilon + X^\varepsilon$, where
$\tilde X^\varepsilon$ is a compound Poisson process with intensity
$\lambda_\varepsilon = \int \bar c_\varepsilon(u)\,\nu(du)$ and jump distribution
$\rho_\varepsilon(du) = \bar c_\varepsilon(u)\,\nu(du) / \lambda_\varepsilon$, and
$X^\varepsilon$ is a Levy process with triplet
$(b_\varepsilon, \Sigma, c_\varepsilon(u)\,\nu(du))$, where we set
$b_\varepsilon = b - \int \mathbf{1}_{\{\|u\| \le 1\}}\, u\, \bar c_\varepsilon(u)\,\nu(du)$.

Since our result is a distributional one only, it is possible to work with this particular
representation of $X$ in the following. Call $N_t^\varepsilon$ the number of jumps of
$\tilde X^\varepsilon$ up to time $t$.
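Define $f(u) = \mathbf{1}_{\{u \ge x\}}$ in a componentwise sense. Using the law of total
expectation we then have

$$E[f(X_t)] = \sum_{k=0}^{\infty} e^{-\lambda_\varepsilon t} \frac{(\lambda_\varepsilon t)^k}{k!}\, E[f(X_t) \mid N_t^\varepsilon = k] = e^{-\lambda_\varepsilon t} E[f(X_t^\varepsilon)] + e^{-\lambda_\varepsilon t} \lambda_\varepsilon t\, E[f(X_t^\varepsilon + \xi_1)] + e^{-\lambda_\varepsilon t} \sum_{k=2}^{\infty} \frac{(\lambda_\varepsilon t)^k}{k!}\, E\Big[f\Big(X_t^\varepsilon + \sum_{i=1}^{k} \xi_i\Big)\Big],$$

where the $\xi_i$ are i.i.d. $\sim \rho_\varepsilon$. As noted before, we may proceed similarly
to Figueroa-Lopez and Houdre (2009) now: Using their equation (3.3), the condition
$\varepsilon < \delta/2$ ensures the existence of $K$ and $t_0 > 0$, both depending on $\delta$
only, such that

$$e^{-\lambda_\varepsilon t} E[f(X_t^\varepsilon)] \le P(X_t^{(\varepsilon, 1)} \ge \delta) \le K t^2$$

for all $0 < t < t_0$, where $X^{(\varepsilon, 1)}$ denotes the first component of
$X^\varepsilon$. Also,

$$e^{-\lambda_\varepsilon t} \sum_{k=2}^{\infty} \frac{(\lambda_\varepsilon t)^k}{k!}\, E\Big[f\Big(X_t^\varepsilon + \sum_{i=1}^{k} \xi_i\Big)\Big] \le K t^2.$$

It therefore remains to focus on $E[f(X_t^\varepsilon + \xi_1)]$. The distribution of $\xi_1$ is
$s(u)\bar c_\varepsilon(u)\,du / \lambda_\varepsilon$, and as a consequence of Assumption 4.1
(ii) it follows that

$$g(u) = E[f(u + \xi_1)] = P(u_1 + \xi_1^{(1)} \ge x_1,\ u_2 + \xi_1^{(2)} \ge x_2)$$

is twice continuously differentiable with bounded derivatives. Using independence of
$X^\varepsilon$ and $\xi_1$, it is sufficient to discuss $E[g(X_t^\varepsilon)]$, for which we
can use Ito's formula now: For arbitrary $Y$ we have

$$g(Y_t) = g(Y_0) + \int_0^t \nabla g(Y_{s-})\,dY_s + \frac{1}{2} \sum_{1 \le i, j \le 2} \int_0^t g_{ij}(Y_{s-})\,d[Y^i, Y^j]_s^c + \sum_{0 < s \le t} \big(g(Y_s) - g(Y_{s-}) - \nabla g(Y_{s-})\Delta Y_s\big), \tag{4.4}$$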



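where the quadratic covariation $[Y^i, Y^j]_s^c$ becomes $\Sigma_{ij}s$ in case of a Levy
process and $\Delta Y_s$ is the jump size at time $s$. Also, $\nabla g$ and $g_{ij}$ denote the
gradient and the corresponding partial derivatives of $g$. Plugging in $X^\varepsilon$ for $Y$
we discuss each of the four summands above separately: first, $u \ge x$ implies
$\|u\| \ge \|x\| \ge \delta > \varepsilon$, and thus

$$g(X_0^\varepsilon) = g(0) = P(\xi_1^{(1)} \ge x_1,\ \xi_1^{(2)} \ge x_2) = \frac{1}{\lambda_\varepsilon} \int \mathbf{1}_{\{u \ge x\}}\, s(u)\bar c_\varepsilon(u)\,du = \frac{1}{\lambda_\varepsilon} \int \mathbf{1}_{\{u \ge x\}}\, s(u)\,du = \frac{1}{\lambda_\varepsilon}\, \nu([x_1, \infty) \times [x_2, \infty)).$$

Second, the Levy triplet of $X^\varepsilon$ is $(b_\varepsilon, \Sigma, c_\varepsilon(u)\,\nu(du))$.
From $\varepsilon < 1$ we conclude that $X^\varepsilon$ does not admit jumps larger than 1, and
therefore $dX_s^\varepsilon$ consists of three summands, of which two correspond to
martingales. Therefore

$$\Big| E\Big[\int_0^t \nabla g(X_{s-}^\varepsilon)\,dX_s^\varepsilon\Big] \Big| \le \int_0^t \big| E[\nabla g(X_{s-}^\varepsilon)]\, b_\varepsilon \big|\,ds \le K t$$

due to boundedness of the first derivatives of $g$. We may proceed similarly for the third
term in (4.4), whereas a conditioning argument gives

$$E\Big[\sum_{0 < s \le t} \big(g(X_s^\varepsilon) - g(X_{s-}^\varepsilon) - \nabla g(X_{s-}^\varepsilon)\Delta X_s^\varepsilon\big)\Big] = \int_0^t \int E\big[g(X_{s-}^\varepsilon + u) - g(X_{s-}^\varepsilon) - \nabla g(X_{s-}^\varepsilon)u\big]\, c_\varepsilon(u)\,\nu(du)\,ds$$

for the final quantity. The multidimensional Taylor formula proves that the inner integrand
above may be bounded by $K\|u\|^2$. Since $\nu$ is a Levy measure, we obtain

$$\Big| E[g(X_t^\varepsilon)] - \frac{1}{\lambda_\varepsilon}\, \nu([x_1, \infty) \times [x_2, \infty)) \Big| \le K t.$$

From $|1 - \exp(-\lambda_\varepsilon t)| \le K t$ for $0 < t < t_0$ the conclusion follows. □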



Proof of Proposition 4.2. Before we begin with the proof, note that due to Theorem
1.6.1 in van der Vaart and Wellner (2007) weak convergence in
$\mathcal{B}_\infty((0, \infty]^2)$ is equivalent to weak convergence on each
$\ell^\infty(T_k)$, which is the space of all bounded functions on $T_k$ endowed with the
uniform norm. Therefore it is possible to fix one such $T_k$ throughout the rest of the
proof.

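Let us introduce some additional notation. We define a class of functions
$\mathcal{F}_n = \{f_{n,x} : x \in T_k\}$ via

$$f_{n,x}(p) = \sqrt{n/k_n}\, \big(\mathbf{1}_{\{p \ge x \ge (0,0)\}} + \mathbf{1}_{\{p_1 \ge x_1,\ x_2 = -\infty\}} + \mathbf{1}_{\{p_2 \ge x_2,\ x_1 = -\infty\}}\big).$$

Furthermore, we set

$$\bar\gamma_n(x) = \sqrt{k_n}\,\big(U_n(x) - E[U_n(x)]\big) = n^{-1/2} \sum_{j=1}^{n} \big(f_{n,x}(\Delta_j^n X) - E[f_{n,x}(\Delta_j^n X)]\big).$$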



A consequence of Lemma 4.3 is that it is sufficient to discuss weak convergence of
$\bar\gamma_n(x)$ only. Indeed, let $x \in T_k$. Then by stationarity of increments of $X$ and
using $k_n = n\Delta_n$ we have

$$E[U_n(x)] - U(x) = \Delta_n^{-1} P(\Delta_1^n X^{(1)} \ge x_1,\ \Delta_1^n X^{(2)} \ge x_2) - \nu([x_1, \infty) \times [x_2, \infty)).$$

This quantity is bounded by $K\Delta_n$ due to Lemma 4.3, so the growth condition
$\sqrt{k_n}\,\Delta_n \to 0$ ensures that
$\gamma_n(x) - \bar\gamma_n(x) = \sqrt{k_n}\,(E[U_n(x)] - U(x))$ is uniformly small on each
fixed $T_k$.

In order to prove $Y_n \rightsquigarrow B$ on $\ell^\infty(T_k)$ we will employ Theorem 11.20 in Kosorok
(2008), for which several intermediate results have to be shown. To begin with, set

$$F_n(p) = \sqrt{n/k_n}\, 1_{\{p \in T_k\}},$$

which is a sequence of integrable (with respect to any probability measure) envelopes.
The first two steps are related to the class of functions $\mathcal{F}_n$. We start with the proof of an
entropy condition, namely

$$\limsup_{n \to \infty} \sup_Q \int_0^1 \sqrt{\log N\big( \epsilon \|F_n\|_{Q,2},\, \mathcal{F}_n,\, L_2(Q) \big)}\, d\epsilon < \infty, \qquad (4.5)$$

where N denotes the covering number of the set $\mathcal{F}_n$ and the supremum runs over all
probability measures Q with finite support such that $\|F_n\|_{Q,2} = (\int F_n^2(p)\, dQ(p))^{1/2} > 0$.
Thanks to the special form of $\mathcal{F}_n$, this result is a simple consequence of Lemma 11.21 in
Kosorok (2008): it suffices to check that each $\mathcal{F}_n$ is a VC-class with VC-index 5. This
follows from the fact that each set of five points in the domain of the $f_{n,x}$ has either a subset
of three elements in $[0,\infty]^2 \setminus \{(0,0)\}$, or a subset of two elements in one of the stripes
through $-\infty$. In neither case can these subsets be shattered by the sets deduced from the
indicators in the definition of $f_{n,x}$.
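The shattering claim for the quadrant part can be checked by brute force. The following is only an illustrative sketch, not part of the proof: it enumerates the subsets of a finite point set that are cut out by sets of the form $\{p : p_1 \ge x_1, p_2 \ge x_2\}$ and tests whether all subsets occur. The helper names `picked_subsets` and `is_shattered` are hypothetical, and only finite corners x are considered.

```python
import itertools
import random


def picked_subsets(points):
    """All index sets {i : p_i >= x componentwise} obtainable by moving the corner x."""
    # Only corners on the grid of observed coordinates (plus +inf) give distinct subsets.
    xs = sorted({p[0] for p in points}) + [float("inf")]
    ys = sorted({p[1] for p in points}) + [float("inf")]
    subsets = set()
    for x1, x2 in itertools.product(xs, ys):
        subsets.add(frozenset(i for i, p in enumerate(points)
                              if p[0] >= x1 and p[1] >= x2))
    return subsets


def is_shattered(points):
    """True if every subset of `points` is picked out by some upper-right quadrant."""
    return len(picked_subsets(points)) == 2 ** len(points)


# Two points in "anti-diagonal" position can be shattered ...
two_points = [(0.0, 1.0), (1.0, 0.0)]
# ... but no triple among many random ones can.
rng = random.Random(0)
triples = [[(rng.random(), rng.random()) for _ in range(3)] for _ in range(2000)]

print(is_shattered(two_points))
print(any(is_shattered(t) for t in triples))
```

The random search is of course no proof; it merely illustrates why three points in the quadrant part already suffice for the counting argument above.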

The second condition to check is that $\mathcal{F}_n$ is almost measurable Suslin, and it follows
from Lemma 11.15 and the discussion on page 224 in Kosorok (2008) that it is sufficient
to prove separability of $\mathcal{F}_n$, that is the existence of a countable subset $T_{n,k}$ of $T_k$ such that

$$P^*\Big( \sup_{x \in T_k} \inf_{y \in T_{n,k}} |f_{n,x}(\Delta_1^n X) - f_{n,y}(\Delta_1^n X)| > 0 \Big) = 0.$$

Here, $P^*$ denotes the outer probability, since measurability of the event within the brackets
is not ensured. Set $T_{n,k} = T_k \cap \bar{\mathbb{Q}}^2$. Then for each $\omega$ and each $x \in T_k$, there exists a
$y \in T_{n,k}$ such that $f_{n,x}(\Delta_1^n X(\omega)) = f_{n,y}(\Delta_1^n X(\omega))$, since the $f_{n,x}$ are indicator functions.
This proves separability of $\mathcal{F}_n$.

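The remaining steps regard the behaviour of the variances and covariances of the $f_{n,x}$
and their envelopes. We have

$$\lim_{n \to \infty} E[Y_n(x) Y_n(y)] = \lim_{n \to \infty} E[f_{n,x}(\Delta_1^n X) f_{n,y}(\Delta_1^n X)] = U(x \vee y) \qquad (4.6)$$

as well as

$$\lim_{n \to \infty} E[F_n^2(\Delta_1^n X)] \le U(1/k, -\infty) + U(-\infty, 1/k)$$

and

$$\lim_{n \to \infty} E\big[ F_n^2(\Delta_1^n X)\, 1_{\{F_n(\Delta_1^n X) > \epsilon \sqrt{n}\}} \big] \le \lim_{n \to \infty} E[F_n^2(\Delta_1^n X)]\, (\epsilon \sqrt{k_n})^{-1} = 0.$$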

Finally, as in (4.6) we have for $x, y \in T_k$ that

$$\rho_n(x,y) = \big( E[(f_{n,x}(\Delta_1^n X) - f_{n,y}(\Delta_1^n X))^2] \big)^{1/2} \to \big( U(x) + U(y) - 2U(x \vee y) \big)^{1/2} = \rho(x,y),$$

and due to Lemma 4.3 the convergence holds uniformly as well. This completes the proof.



□ 



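Proof of Theorem 4.5. Let $\mathcal{B}^{T}_\infty((0,\infty]^2) \subset \mathcal{B}_\infty((0,\infty]^2)$ and $\mathcal{B}^{T}_\infty((0,\infty]) \subset \mathcal{B}_\infty((0,\infty])$
denote the spaces of all tail integrals of bivariate Lévy measures concentrated
on the first quadrant or of univariate Lévy measures concentrated on $(0,\infty]$, respectively.
Consider the mapping $\Phi : \mathcal{B}^{T}_\infty((0,\infty]^2) \times (\mathcal{B}^{T}_\infty((0,\infty]))^2 \to \mathcal{B}_\infty((0,\infty]^2)$, defined by
$\Phi = \Phi_3 \circ \Phi_2 \circ \Phi_1$ with

$$\Phi_1 : \mathcal{B}^{T}_\infty((0,\infty]^2) \times (\mathcal{B}^{T}_\infty((0,\infty]))^2 \to \mathcal{B}^{T}_\infty((0,\infty]^2) \times (\mathcal{B}^{\leftarrow}_\infty((0,\infty]))^2, \quad (U, U_1, U_2) \mapsto (U, U_1^{\leftarrow}, U_2^{\leftarrow}),$$

$$\Phi_2 : \mathcal{B}^{T}_\infty((0,\infty]^2) \times (\mathcal{B}^{\leftarrow}_\infty((0,\infty]))^2 \to \mathcal{B}^{T}_\infty((0,\infty]^2) \times (\mathcal{B}^{\leftarrow}_\infty([0,\infty)))^2, \quad (U, V_1, V_2) \mapsto (U, V_1 \circ P, V_2 \circ P),$$

$$\Phi_3 : \mathcal{B}^{T}_\infty((0,\infty]^2) \times (\mathcal{B}^{\leftarrow}_\infty([0,\infty)))^2 \to \mathcal{B}_\infty((0,\infty]^2), \quad (U, G_1, G_2) \mapsto U(G_1, G_2),$$

where $P(x) = 1/x$ and where, in the last step, $G_j(\infty) = \infty$. Moreover, $\mathcal{B}^{\leftarrow}_\infty((0,\infty]) \subset \mathcal{B}_\infty((0,\infty])$
and $\mathcal{B}^{\leftarrow}_\infty([0,\infty)) \subset \mathcal{B}_\infty([0,\infty))$ are defined as the images of the associated
function spaces under the respective mappings. Set also $\Gamma_{n,1}(x) = U_n(U_1^{\leftarrow}(1/x), -\infty)$ and
$\Gamma_{n,2}(x) = U_n(-\infty, U_2^{\leftarrow}(1/x))$. The proof will now basically consist of two steps. We start
with discussing weak convergence of

$$\sqrt{k_n}\, \big( \Phi(\tilde\Gamma_n, \Gamma_{n,1}, \Gamma_{n,2}) - \Phi(\Gamma, P, P) \big), \qquad (4.7)$$

whereas this result is transferred to the original claim later on.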



Let us begin with the proof of (4.7). This assertion follows from the functional delta
method in topological vector spaces, see van der Vaart and Wellner (1996), if we prove
first that

$$\sqrt{k_n}\, \{ (\tilde\Gamma_n, \Gamma_{n,1}, \Gamma_{n,2}) - (\Gamma, P, P) \} \rightsquigarrow (\tilde G, \tilde G(\cdot, -\infty), \tilde G(-\infty, \cdot))$$

in $\mathcal{B}_\infty((0,\infty]^2) \times (\mathcal{B}_\infty((0,\infty]))^2$ and second that $\Phi$ is Hadamard-differentiable at $(\Gamma, P, P)$
tangentially to suitable subspaces with derivative

$$\big( \Phi'_{(\Gamma,P,P)}(U, U_1, U_2) \big)(u) = U(u) + u_1^2\, \dot\Gamma_1(u)\, U_1(u_1) + u_2^2\, \dot\Gamma_2(u)\, U_2(u_2), \qquad (4.8)$$

where the summands involving the partial derivatives on the right-hand side are defined as
0 if one of the coordinates of u equals $\infty$. The first claim follows easily from Proposition
4.2 and the continuous mapping theorem. Regarding the second assertion we need to
clarify the metrics on the corresponding spaces. The canonical definitions are

$$d(f, g) = \sum_{k=1}^{\infty} 2^{-k} \big( \|f - g\|_{T_k} \wedge 1 \big),$$

where $T_k = [1/k, \infty]^2$ in case of $\mathcal{B}_\infty((0,\infty]^2)$, while $T_k = [1/k, \infty]$ and $T_k = [0, k]$
for $\mathcal{B}_\infty((0,\infty])$ and $\mathcal{B}_\infty([0,\infty))$, respectively. Unfortunately, the mapping $\Phi_1$ is not
Hadamard-differentiable with respect to these metrics (see the proof of Lemma 7.2 below
for details), whence we need to consider the weaker modifications

$$d_2(f, g) = \sum_{k=1}^{\infty} 2^{-k} \big( \|f - g\|_{S_k} \wedge 1 \big),$$

where $S_k = ([1/k, k] \cup \{\infty\})^2$ in case of $\mathcal{B}_\infty((0,\infty]^2)$, while $S_k = [1/k, k] \cup \{\infty\}$ and
$S_k = \{0\} \cup [1/k, k]$ for $\mathcal{B}_\infty((0,\infty])$ and $\mathcal{B}_\infty([0,\infty))$, respectively. With these modifications,
it follows from Lemma 7.1 and the chain rule that

$$\Phi : (\mathcal{B}^{T}_\infty((0,\infty]^2), d) \times (\mathcal{B}^{T}_\infty((0,\infty]), d)^2 \to (\mathcal{B}_\infty((0,\infty]^2), d_2)$$

is Hadamard-differentiable at $(\Gamma, P, P)$ with derivative as specified in (4.8) tangentially to

$$D_0 = \{ (U, U_1, U_2) \in C((0,\infty]^2) \times (C((0,\infty]))^2 \mid U_j(\infty) = 0,\ \lim_{x \to 0} x^2 U_j(x) = 0 \}. \qquad (4.9)$$

Here, $C((0,\infty]^2)$ and $C((0,\infty])$ denote the sets of all functions on $(0,\infty]^2$ and $(0,\infty]$ that
are continuous with respect to the pseudo metrics $\rho(u, v) = |\Gamma(u) - \Gamma(v)|^{1/2}$ and
$\rho(u, v) = |1/u - 1/v|^{1/2}$, respectively. Hence, observing $(\tilde G, \tilde G(\cdot, -\infty), \tilde G(-\infty, \cdot)) \in D_0$, the functional
delta method yields

$$\sqrt{k_n}\, \big( \Phi(\tilde\Gamma_n, \Gamma_{n,1}, \Gamma_{n,2}) - \Phi(\Gamma, P, P) \big) \rightsquigarrow G$$

in $(\mathcal{B}_\infty((0,\infty]^2), d_2)$.



We will use the approximation Theorem 4.2 in Billingsley (1968), adapted to the
concept of weak convergence in the sense of Hoffmann-Jørgensen, to transfer this result
to weak convergence in $(\ell^\infty([\eta,\infty]^2), \|\cdot\|_\infty)$ for all $\eta > 0$ and hence in $(\mathcal{B}_\infty((0,\infty]^2), d)$.
To this end, define

$$W_n(u) = \sqrt{k_n}\, \big( \Phi(\tilde\Gamma_n, \Gamma_{n,1}, \Gamma_{n,2}) - \Phi(\Gamma, P, P) \big)(u)$$

and

$$W_{n,M}(u) = \sqrt{k_n}\, \big( \Phi(\tilde\Gamma_n, \Gamma_{n,1}, \Gamma_{n,2}) - \Phi(\Gamma, P, P) \big)(u)\, 1_{\{u \in [\eta, M]^2\}}.$$

Then $W_{n,M} \rightsquigarrow G_M$ with $G_M(u) := G(u) 1_{\{u \in [\eta, M]^2\}}$ for $n \to \infty$ and $G_M \rightsquigarrow G$ for
$M \to \infty$ in $(\ell^\infty([\eta,\infty]^2), \|\cdot\|_\infty)$, and it remains to prove that

$$\lim_{M \to \infty} \limsup_{n \to \infty} P^*\Big( \sup_{u_1 > M \text{ or } u_2 > M} \big| \sqrt{k_n}\, ( \Phi(\tilde\Gamma_n, \Gamma_{n,1}, \Gamma_{n,2}) - \Phi(\Gamma, P, P) )(u) \big| > \varepsilon \Big) = 0.$$

Noting that $\Phi(\Gamma, P, P) = \Gamma$ the probability can be bounded by

$$P^*\Big( \sup_{u_1 > M/2 \text{ or } u_2 > M/2} |\sqrt{k_n}\, (\tilde\Gamma_n - \Gamma)(u)| > \varepsilon \Big) + P^*\big( \exists\, u \text{ with } u_1 > M \text{ or } u_2 > M : \Gamma_{n,i}^{\leftarrow} \circ P(u_i) < M/2,\ i = 1, 2 \big).$$

The Portmanteau Theorem implies that the lim sup of the first probability converges to
0 for $M \to \infty$ using Proposition 4.2. Furthermore, some thoughts reveal that
$\Gamma_{n,i}^{\leftarrow}(z) = 1/(U_i(U_{n,i}^{\leftarrow}(z)))$ for all $z > 0$. Due to monotonicity of $\Gamma_{n,i}^{\leftarrow} \circ P$, the second probability is
bounded by $P(\Gamma_{n,i}^{\leftarrow} \circ P(M) < M/2,\ i = 1, 2)$, which thus converges to 0 for $n \to \infty$ observing
that $\Gamma_{n,i}^{\leftarrow} \circ P(M) = M + o_P(1)$.

In the final step we will prove $\sqrt{k_n}(\hat\Gamma_n - \Gamma) \rightsquigarrow G$ in each $(\ell^\infty([\eta,\infty]^2), \|\cdot\|_\infty)$, for
which we heavily rely on the fact that the same result holds for the statistic discussed
above. A consequence of the identity $\Gamma_{n,i}^{\leftarrow}(z) = 1/(U_i(U_{n,i}^{\leftarrow}(z)))$ is that $\Phi(\tilde\Gamma_n, \Gamma_{n,1}, \Gamma_{n,2})(u)$
and $\hat\Gamma_n(u)$ coincide as long as $U_{n,i}^{\leftarrow}(1/u_i) \ne 0$ for $i = 1, 2$. By monotonicity it is therefore
sufficient to prove that the probability of $U_{n,i}^{\leftarrow}(1/\eta) = 0$ becomes small, which is precisely

$$\lim_{n \to \infty} P\big( U_{n,i}^{\leftarrow}(1/\eta) = 0 \big) = 0.$$

To this end, let $N_i(n)$ denote the number of positive increments of $X^{(i)}$. By definition
of the generalized inverse function in (3.3) we have that $U_{n,i}^{\leftarrow}(1/\eta) = 0$ is equivalent to
$1/\eta > N_i(n)/k_n$ or $N_i(n) < k_n/\eta$. Furthermore, letting $M_i(n)$ be the number of positive
increments of the process $Z^{(i)}$ we see that it is sufficient to prove

$$\lim_{n \to \infty} P(M_i(n) < k_n/\eta) = 0,$$

since X does not admit negative jumps. Note that we have

$$P\big( \Delta_j^n Z^{(i)} > 0 \big) = P\big( \Delta_j^n B^{(i)} > -a_i \Delta_n \big) = P\big( N > -a_i \Delta_n^{1/2} \big) = \tfrac{1}{2} + o(1),$$

where N is a standard Gaussian variable. Let n be large enough in order for the probability
above to be larger than 1/3. For such n we conclude easily that

$$P(M_i(n) < k_n/\eta) \le P(\mathrm{Bin}(n, 1/3) < k_n/\eta) \to 0,$$

e.g., from Markov inequality and (4.1). This finishes the proof. □
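To get a feeling for how small the binomial probability in the last display is, one can evaluate a Chebyshev bound (that is, Markov's inequality applied to the squared deviation from the mean). The sketch below is illustrative only: the function name `chebyshev_lower_tail` is hypothetical, n and k_n are taken from the simulation study in Section 5.2, and the value of η is an arbitrary assumption.

```python
def chebyshev_lower_tail(n, p, threshold):
    """Chebyshev upper bound for P(Bin(n, p) < threshold); informative if threshold < n*p."""
    mean = n * p
    var = n * p * (1.0 - p)
    if threshold >= mean:
        return 1.0  # the bound carries no information in this regime
    return min(1.0, var / (mean - threshold) ** 2)


n, k_n = 22500, 75   # sample size and one choice of k_n from the simulation study
eta = 0.5            # assumed lower end of the domain [eta, infinity]^2
bound = chebyshev_lower_tail(n, 1.0 / 3.0, k_n / eta)
print(bound)
```

Since the threshold k_n/η grows like nΔ_n while the mean grows like n/3, the bound is of order 1/n whenever Δ_n tends to 0.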



5 Discussion and simulations 

5.1 An asymptotic comparison 

Suppose a statistician has knowledge of the marginal tail integrals. In this case, the results
in Section 4 provide two competitive asymptotically unbiased estimators for the Pareto
Lévy copula, namely the oracle estimator $\tilde\Gamma_n$ exploiting knowledge of the marginals and
the empirical Pareto Lévy copula $\hat\Gamma_n$ ignoring this additional information. The following
proposition gives a partial answer to the question of which estimator is (asymptotically)
preferable. Perhaps surprisingly, ignoring the additional knowledge decreases the asymptotic
variance under certain growth conditions on $\Gamma$. A similar observation has recently
been made in the context of copula estimation, see Genest and Segers (2010).



Proposition 5.1 Suppose that the Pareto Lévy copula $\Gamma$ has continuous first order partial
derivatives and that the functions

$$u_1 \mapsto u_1 \Gamma(u_1, u_2) = \frac{\Gamma(u_1, u_2)}{\Gamma(u_1, 0)}, \qquad u_2 \mapsto u_2 \Gamma(u_1, u_2) = \frac{\Gamma(u_1, u_2)}{\Gamma(0, u_2)} \qquad (5.1)$$

are non-decreasing for fixed $u_2 \in (0,\infty]$ and $u_1 \in (0,\infty]$, respectively. Then the Gaussian
fields $G$ (the limit for the empirical Pareto Lévy copula) and $\tilde G$ (the limit for the oracle estimator) satisfy the inequality

$$\mathrm{Cov}\{G(u), G(v)\} \le \mathrm{Cov}\{\tilde G(u), \tilde G(v)\}$$

for all $u, v \in (0,\infty]^2$. Particularly, $\mathrm{Var}\{G(u)\} \le \mathrm{Var}\{\tilde G(u)\}$.

Proof. The proof is rather straightforward whence we restrict ourselves to the main
idea. We have $\mathrm{Cov}(G(u), G(v)) - \mathrm{Cov}(\tilde G(u), \tilde G(v)) = \sum_{i=1}^{8} A_i$, where $A_i = A_i(u, v)$ is
defined as

$$A_1 = u_1^2 \dot\Gamma_1(u)\, v_1^2 \dot\Gamma_1(v)\, \frac{1}{u_1 \vee v_1}, \qquad A_5 = u_1^2 \dot\Gamma_1(u)\, \Gamma(u_1 \vee v_1, v_2),$$
$$A_2 = u_1^2 \dot\Gamma_1(u)\, v_2^2 \dot\Gamma_2(v)\, \Gamma(u_1, v_2), \qquad A_6 = u_2^2 \dot\Gamma_2(u)\, \Gamma(v_1, u_2 \vee v_2),$$
$$A_3 = u_2^2 \dot\Gamma_2(u)\, v_1^2 \dot\Gamma_1(v)\, \Gamma(v_1, u_2), \qquad A_7 = v_1^2 \dot\Gamma_1(v)\, \Gamma(u_1 \vee v_1, u_2),$$
$$A_4 = u_2^2 \dot\Gamma_2(u)\, v_2^2 \dot\Gamma_2(v)\, \frac{1}{u_2 \vee v_2}, \qquad A_8 = v_2^2 \dot\Gamma_2(v)\, \Gamma(u_1, u_2 \vee v_2).$$

The four summands on the left-hand side are non-negative, whereas the other four
are non-positive. For symmetry reasons we may suppose $u_1 \le v_1$. Distinguishing the
two cases $u_2 \le v_2$ and $u_2 > v_2$, some easy calculations (which frequently exploit condition
(5.1)) show that $A_5 + A_1,\ A_6 + A_4,\ A_7 + A_3,\ A_8 + A_2 \le 0$ in the first case, while
$A_5 + A_2,\ A_6 + A_3,\ A_7 + A_1,\ A_8 + A_4 \le 0$ in the second case. □



Under the assumptions of Proposition 5.1 the condition in (5.1) is equivalent to

$$u_1 \dot\Gamma_1(u) + \Gamma(u) \ge 0, \qquad u_2 \dot\Gamma_2(u) + \Gamma(u) \ge 0$$

for each $u = (u_1, u_2) \in (0,\infty]^2$, which is easily accessible for most parametric classes of
Pareto Lévy copulas. For instance, for the Clayton Pareto Lévy copula given by

$$\Gamma(u_1, u_2) = (u_1^{\theta} + u_2^{\theta})^{-1/\theta}$$

we have

$$u_1 \dot\Gamma_1(u) + \Gamma(u) = (u_1^{\theta} + u_2^{\theta})^{-1/\theta - 1} u_2^{\theta}, \qquad u_2 \dot\Gamma_2(u) + \Gamma(u) = (u_1^{\theta} + u_2^{\theta})^{-1/\theta - 1} u_1^{\theta},$$

which is readily seen to be non-negative. In Figure 1 we depict the graph of the asymptotic
relative efficiency

$$[0,2]^2 \to [0,\infty), \qquad u \mapsto \frac{\mathrm{Var}\{G(u)\}}{\mathrm{Var}\{\tilde G(u)\}}$$

of the oracle estimator $\tilde\Gamma_n$ to the empirical Pareto Lévy copula $\hat\Gamma_n$ for $u \in [0,2]^2$. The
Clayton parameter is chosen as $\theta = 0.5$. Close to the axis the relative efficiency decreases
to 0, while the maximal relative efficiency is attained on the diagonal with a value of
$21/32 \approx 0.656$. Even in this best case, the difference is seen to be substantial.

Figure 1: The graph of the asymptotic relative efficiency of $\tilde\Gamma_n$ to $\hat\Gamma_n$ for the Clayton Pareto
Lévy copula with $\theta = 0.5$.
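The value 21/32 on the diagonal can be reproduced numerically. The sketch below rests on assumptions that are not spelled out in this section: the Clayton Pareto Lévy copula is taken as $(u_1^\theta + u_2^\theta)^{-1/\theta}$, the oracle limit is assumed to have covariance $\Gamma(u \vee v)$ (in analogy with (4.6)) with marginal variances $1/u_j$, and the empirical limit is assumed to be the oracle limit plus the two correction terms from (4.8). All function names are hypothetical.

```python
THETA = 0.5  # Clayton parameter used for Figure 1


def gamma(u1, u2, theta=THETA):
    """Clayton Pareto Levy copula (assumed parametrisation)."""
    return (u1 ** theta + u2 ** theta) ** (-1.0 / theta)


def gamma_dot(j, u1, u2, theta=THETA):
    """Partial derivative of gamma with respect to its j-th argument."""
    s = u1 ** theta + u2 ** theta
    uj = u1 if j == 1 else u2
    return -s ** (-1.0 / theta - 1.0) * uj ** (theta - 1.0)


def var_oracle(u1, u2):
    """Asymptotic variance of the oracle estimator (assumed to be Gamma(u))."""
    return gamma(u1, u2)


def var_empirical(u1, u2):
    """Variance of G~(u) + c1*G~_1(u1) + c2*G~_2(u2) with c_j = u_j^2 * gamma_dot_j(u)."""
    g = gamma(u1, u2)
    c1 = u1 ** 2 * gamma_dot(1, u1, u2)
    c2 = u2 ** 2 * gamma_dot(2, u1, u2)
    return (g + c1 ** 2 / u1 + c2 ** 2 / u2   # variances
            + 2.0 * c1 * c2 * g               # covariance of the two marginal terms
            + 2.0 * (c1 + c2) * g)            # covariances with the joint term


def relative_efficiency(u1, u2):
    return var_empirical(u1, u2) / var_oracle(u1, u2)


def condition_51_holds(u1, u2):
    """Differential form of condition (5.1)."""
    g = gamma(u1, u2)
    return u1 * gamma_dot(1, u1, u2) + g >= 0 and u2 * gamma_dot(2, u1, u2) + g >= 0


print(relative_efficiency(1.0, 1.0), 21 / 32)
print(relative_efficiency(1.0, 0.2))
```

Under these assumptions the ratio is constant along the diagonal and smaller away from it, in line with the description of Figure 1.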



5.2 Simulation study 

In order to obtain an impression of the performance of the asymptotic results stated in
the previous section we will discuss some finite sample properties concerning Proposition
4.2 and Theorem 4.5. In both cases, the setting is as follows: We simulate (essentially)
two 1/2-stable subordinators, i.e., both tail integrals are given by $U_i(x) = (\pi x)^{-1/2}$, which
are coupled by a Clayton Pareto Lévy copula with $\theta = 1/2$. Sometimes we add two
independent Brownian motions with variance 1/2 each, sometimes we assume to observe
the pure jump processes only. Throughout the study we use n = 22,500 observations and
run the simulation 500 times each.

What differs from setting to setting is the choice of $k_n$, or, equivalently, of $\Delta_n$. Recall
that the rate of convergence is $k_n^{-1/2}$ (which in light of the results in Figueroa-López
and Houdré (2009) appears to be a natural one in the context of estimating the Lévy
measure). Hence, a larger $k_n$ suggests a better performance of the normal approximation,
whereas Lemma 4.3 indicates that the magnitude of the bias grows with $k_n$ as well. Both
intuitive properties are visible from the simulation study provided in the following and
from additional results which we do not show for the sake of brevity.
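The trade-off can be made concrete with a little arithmetic. The following sketch tabulates, for the sample size and the values of $k_n$ used in this study, the implied interval length $\Delta_n = k_n/n$, the rate $k_n^{-1/2}$ and the quantity $\sqrt{k_n}\,\Delta_n$ that appears in the growth condition of the proof of Proposition 4.2. The helper name `design` is hypothetical, and the numbers only illustrate orders of magnitude.

```python
import math


def design(n, k_n):
    """Interval length, convergence rate and bias scale implied by n and k_n = n * delta_n."""
    delta_n = k_n / n
    return {
        "k_n": k_n,
        "delta_n": delta_n,
        "rate": k_n ** -0.5,                      # rate of convergence k_n^{-1/2}
        "bias_scale": math.sqrt(k_n) * delta_n,   # sqrt(k_n) * delta_n from the growth condition
    }


n = 22500
rows = [design(n, k_n) for k_n in (50, 75, 100, 150)]
for row in rows:
    print(row)
```

Increasing $k_n$ lowers the rate column but raises the bias-scale column, which is the tension described above.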

Despite the fact that we have proven weak convergence of our estimators in certain
function spaces we restrict ourselves to an analysis of the finite dimensional properties of
our estimators. Let us begin with the asymptotics in Proposition 4.2 for which we estimate
U(x, x) for x = 2, 1, 0.5. Table 1 gives estimated bias and (co)variance for different choices
of $k_n$. Note that we have $\mathrm{Cov}(B(x), B(y)) = (32\pi)^{-1/2} \approx 0.0997$ whenever x or y equals
(2,2), whereas $\mathrm{Cov}(B(x), B(y)) = (16\pi)^{-1/2} \approx 0.1410$ if the "larger" vector is (1,1) and
finally $\mathrm{Var}(B(x)) = (8\pi)^{-1/2} \approx 0.1995$ for x = (0.5,0.5).

x, y          (2,2)               (1,1)             (0.5,0.5)       (2,0.5)    (2,1)    (1,0.5)
k_n       bias      var       bias      var       bias      var       cov       cov       cov
-----------------------------------------------------------------------------------------------
 50     -0.0106   0.1007    -0.0077   0.1400     0.0023   0.1915    0.0988    0.0978    0.1376
 75     -0.0330   0.0972    -0.0229   0.1453    -0.0395   0.1956    0.1015    0.1001    0.1435
100      0.0168   0.1021     0.0223   0.1375     0.0341   0.1893    0.0996    0.0927    0.1300
150      0.0037   0.1061     0.0154   0.1480     0.0470   0.2180    0.1073    0.1106    0.1531
-----------------------------------------------------------------------------------------------
 50     -0.0281   0.0893    -0.0120   0.1208    -0.0042   0.1863    0.0840    0.0854    0.1233
 75      0.0252   0.0949     0.0115   0.1187     0.0226   0.1861    0.0861    0.0894    0.1216
100      0.0126   0.0922     0.0043   0.1320     0.0401   0.1940    0.0933    0.0932    0.1323
150     -0.0085   0.0929    -0.0127   0.1337     0.0277   0.1991    0.0931    0.0962    0.1371

Table 1: Empirical bias and (co)variances of $\sqrt{k_n}(U_n(x) - U(x))$ for various choices of $k_n$.
Upper four lines: pure subordinator; lower four lines: subordinator + Brownian motion.
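The theoretical (co)variances quoted above can be recomputed from the model specification. The sketch assumes the Clayton Pareto Lévy copula in the form $(u_1^\theta + u_2^\theta)^{-1/\theta}$, the relation $U(x_1, x_2) = \Gamma(1/U_1(x_1), 1/U_2(x_2))$ between tail integral and Pareto Lévy copula, and the covariance structure $U(x \vee y)$ from (4.6); the function names are hypothetical.

```python
import math

THETA = 0.5  # Clayton parameter of the simulation study


def marginal_tail(x):
    """Tail integral of the 1/2-stable subordinator used in the study."""
    return (math.pi * x) ** -0.5


def gamma(u1, u2, theta=THETA):
    """Clayton Pareto Levy copula (assumed parametrisation)."""
    return (u1 ** theta + u2 ** theta) ** (-1.0 / theta)


def joint_tail(x1, x2):
    """Bivariate tail integral U(x1, x2), assuming U = Gamma(1/U_1, 1/U_2)."""
    return gamma(1.0 / marginal_tail(x1), 1.0 / marginal_tail(x2))


def cov_B(x, y):
    """Covariance U(x v y) of the limiting field B, see (4.6)."""
    return joint_tail(max(x[0], y[0]), max(x[1], y[1]))


for x, y in [((2, 2), (2, 2)), ((2, 2), (0.5, 0.5)), ((1, 1), (0.5, 0.5)),
             ((1, 1), (1, 1)), ((0.5, 0.5), (0.5, 0.5))]:
    print(x, y, round(cov_B(x, y), 4))
```

On the diagonal this reduces to $U(x, x) = (16\pi x)^{-1/2}$, which is where the three constants in the text come from.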

Generally, the theoretical (co)variances are well reproduced in both situations, even
though the results probably look slightly better in the first four lines. This is of course
no surprise, since additional Brownian increments make it harder to draw inference on the jump
measure. In order to assess how well the normal approximation works apart from bias and
variance, Figure 2 gives QQ-plots for the medium choice of $k_n = 75$. These plots confirm
that the finite sample properties are indeed satisfying, despite the discrete nature of the
test statistic which simply counts exceedances of certain levels and is rescaled afterwards.

Let us come to the estimation of the Pareto Lévy copula. We proceed in the same way
as before and discuss convergence of the finite dimensional distributions only. For simplicity,
we estimate $\Gamma(x, x)$ for x = 2, 1, 0.5 again, but these are of course different quantities
now. In this case, the variances compute to $\mathrm{Var}(G(x)) = 21/(128x)$, which becomes
approximately 0.0820 for x = 2, 0.1641 for x = 1, and 0.3281 for x = 0.5. Also, for x > y
we have $\mathrm{Cov}(G(x), G(y)) = \tfrac{7}{32}(1/x - \Gamma(x, y))$. Therefore $\mathrm{Cov}(G(2), G(0.5)) \approx 0.0608$,
$\mathrm{Cov}(G(2), G(1)) \approx 0.0718$, and $\mathrm{Cov}(G(1), G(0.5)) \approx 0.1437$. We state their empirical
versions in Table 2.
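These numbers follow directly from the two formulas just stated. A minimal sketch, assuming the Clayton Pareto Lévy copula $(u_1^\theta + u_2^\theta)^{-1/\theta}$ with $\theta = 1/2$ and writing G(x) for the limit at the diagonal point (x, x); the function names are hypothetical.

```python
THETA = 0.5  # Clayton parameter of the simulation study


def gamma(u1, u2, theta=THETA):
    """Clayton Pareto Levy copula (assumed parametrisation)."""
    return (u1 ** theta + u2 ** theta) ** (-1.0 / theta)


def var_G(x):
    """Asymptotic variance at the diagonal point (x, x), as stated in the text."""
    return 21.0 / (128.0 * x)


def cov_G(x, y):
    """Asymptotic covariance between the diagonal points (x, x) and (y, y) for x > y."""
    if not x > y:
        raise ValueError("formula is stated for x > y only")
    return 7.0 / 32.0 * (1.0 / x - gamma(x, y))


for x in (2.0, 1.0, 0.5):
    print("Var", x, var_G(x))
for x, y in ((2.0, 0.5), (2.0, 1.0), (1.0, 0.5)):
    print("Cov", x, y, cov_G(x, y))
```

The printed values agree with the constants quoted above up to rounding in the fourth decimal.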

In this case the growth in bias for larger $k_n$ is clearly visible, and we also have a larger
bias when estimating $\Gamma(0.5, 0.5)$. Overall, however, the results are satisfying again, and
we see from the QQ-plot in Figure 3 that the normal approximation works very well for
$k_n = 75$, no matter if a Brownian motion is added or not.

Figure 2: QQ-plots of the empirical quantiles of $\sqrt{k_n}(U_n(x) - U(x))$ divided by their sample
standard deviation vs. the theoretical quantiles of the standard normal distribution. Upper
three pictures: pure subordinator; lower three pictures: subordinator + Brownian motion.

6 Conclusions

In this paper we have investigated the problem of estimating both the bivariate Lévy
measure and the (Pareto) Lévy copula in a nonparametric way. Our estimators are based
on counting joint large increments of a bivariate Lévy process, and in both cases we were
able to prove weak convergence in appropriate function spaces. At least two natural
extensions of our work are of interest for future research.

First, on the observational side several robustness issues could be discussed. The prime
question in this context is how realistic observations of a bivariate Lévy process at
synchronous and equally spaced times are when we are faced with real data. The simplest
extension probably is to introduce estimators for the case where the observation intervals are
not of equal size. Then we stay in the context of independent increments, for which theory
of weak convergence is established as well; see e.g. Kosorok (2008). If both univariate
processes are observed at different times, the situation is less clear. It might be promising
to follow the approach due to Hayashi and Yoshida (2005) for diffusion processes then, but
the mathematics appears to be demanding. Finally, one could move to bivariate Itô semimartingales
for which both the Brownian part and the Lévy measure depend on a time index and
estimate local versions of $\nu$ and related quantities.

From a statistical point of view it might be interesting to construct several nonparametric
tests concerning the dependence structure of a multivariate Lévy process. This
could include estimation of certain functionals of $\Gamma$ or U as well as tests for independence
or tests for a parametric form of these functions. For this reason, it would be important to
establish a thorough theory concerning (Pareto) Lévy copulas which relates functionals of
$\Gamma$ to certain dependence properties, as in the case of ordinary copulas for which standard
measures such as Kendall's $\tau$ or Spearman's $\rho$ can be written as integrals over C and are
thus accessible through nonparametric estimation of the copula.

x, y          (2,2)               (1,1)             (0.5,0.5)       (2,0.5)    (2,1)    (1,0.5)
k_n       bias      var       bias      var       bias      var       cov       cov       cov
-----------------------------------------------------------------------------------------------
 50      0.0141   0.0827     0.0455   0.1740     0.0863   0.3520    0.0777    0.0668    0.1599
 75     -0.0082   0.0874     0.0173   0.1653     0.1252   0.3428    0.0740    0.0690    0.1459
100      0.0502   0.0783     0.0894   0.1708     0.1748   0.3400    0.0685    0.0508    0.1547
150      0.0356   0.0862     0.1182   0.1646     0.3176   0.3324    0.0744    0.0698    0.1421
-----------------------------------------------------------------------------------------------
 50      0.0345   0.0790     0.0560   0.1637     0.1021   0.3263    0.0699    0.0639    0.1389
 75      0.0091   0.0886     0.0753   0.1760     0.1522   0.3508    0.0832    0.0729    0.1522
100      0.0312   0.0745     0.0776   0.1530     0.1480   0.3033    0.0610    0.0558    0.1305
150      0.0284   0.0866     0.0988   0.1694     0.2074   0.3337    0.0746    0.0725    0.1486

Table 2: Empirical bias and (co)variances of $\sqrt{k_n}(\hat\Gamma_n(x) - \Gamma(x))$ for various choices of $k_n$.
Upper four lines: pure subordinator; lower four lines: subordinator + Brownian motion.



7 Auxiliary results 

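Lemma 7.1 Let $P : (0,\infty] \to [0,\infty)$ denote the function $P(x) = 1/x$.

a) The mapping

$$\Phi_1 : (\mathcal{B}^{T}_\infty((0,\infty]^2), d) \times (\mathcal{B}^{T}_\infty((0,\infty]), d)^2 \to (\mathcal{B}^{T}_\infty((0,\infty]^2), d) \times (\mathcal{B}^{\leftarrow}_\infty((0,\infty]), d_2)^2$$

defined by $\Phi_1(U, U_1, U_2) = (U, U_1^{\leftarrow}, U_2^{\leftarrow})$ is Hadamard-differentiable at $(\Gamma, P, P)$
tangentially to $D_{0,1} = D_0$ as defined in (4.9) with derivative

$$\Phi'_{1,(\Gamma,P,P)}(U, U_1, U_2) = \big( U,\ x_1^{-2} U_1(1/x_1),\ x_2^{-2} U_2(1/x_2) \big).$$

b) The mapping

$$\Phi_2 : (\mathcal{B}^{T}_\infty((0,\infty]^2), d) \times (\mathcal{B}^{\leftarrow}_\infty((0,\infty]), d_2)^2 \to (\mathcal{B}^{T}_\infty((0,\infty]^2), d) \times (\mathcal{B}^{\leftarrow}_\infty([0,\infty)), d_2)^2$$

defined by $\Phi_2(U, V_1, V_2) = (U, V_1 \circ P, V_2 \circ P)$ is Hadamard-differentiable at $(\Gamma, P, P)$
tangentially to $D_{0,2} = \Phi'_{1,(\Gamma,P,P)}(D_{0,1})$ with derivative

$$\Phi'_{2,(\Gamma,P,P)}(U, V_1, V_2) = (U, V_1 \circ P, V_2 \circ P).$$

c) Suppose $\Gamma$ has continuous first order partial derivatives on $(0,\infty)^2$. The mapping

$$\Phi_3 : (\mathcal{B}^{T}_\infty((0,\infty]^2), d) \times (\mathcal{B}^{\leftarrow}_\infty([0,\infty)), d_2)^2 \to (\mathcal{B}_\infty((0,\infty]^2), d_2)$$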

[Figure 3: six QQ-plot panels for k = 75 and x = (2,2), (1,1), (1/2,1/2); horizontal axes: theoretical quantiles.]

Figure 3: QQ-plots of the empirical quantiles of $\sqrt{k_n}(\hat\Gamma_n(x) - \Gamma(x))$ divided by their sample
standard deviation vs. the theoretical quantiles of the standard normal distribution. Upper
three pictures: pure subordinator; lower three pictures: subordinator + Brownian motion.



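defined by $\Phi_3(U, G_1, G_2) = U(G_1, G_2)$ is Hadamard-differentiable at $(\Gamma, \mathrm{id}, \mathrm{id})$
tangentially to $D_{0,3} = \Phi'_{2,(\Gamma,P,P)}(D_{0,2})$ with derivative

$$\big( \Phi'_{3,(\Gamma,\mathrm{id},\mathrm{id})}(U, G_1, G_2) \big)(u) = U(u) + \sum_{j=1}^{2} \dot\Gamma_j(u)\, G_j(u_j),$$

where the sum on the right-hand side is defined as 0 if one of the coordinates of u
equals $\infty$.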



Proof. The assertion in a) is a consequence of Lemma 7.2 below, whereas the assertion
in b) follows from linearity of $\Phi_2$. Regarding c) let $t_n \to 0$ and $(U_n, G_{n1}, G_{n2}) \to (U, G_1, G_2) \in D_{0,3}$
such that $(\Gamma + t_n U_n,\ \mathrm{id}_{[0,\infty)} + t_n G_{n1},\ \mathrm{id}_{[0,\infty)} + t_n G_{n2}) \in \mathcal{B}^{T}_\infty((0,\infty]^2) \times (\mathcal{B}^{\leftarrow}_\infty([0,\infty)))^2$.
First consider $u \in S_{k,1} = [1/k, k]^2$, which allows to decompose

$$t_n^{-1} \{ \Phi_3(\Gamma + t_n U_n,\ \mathrm{id}_{[0,\infty)} + t_n G_{n1},\ \mathrm{id}_{[0,\infty)} + t_n G_{n2}) - \Phi_3(\Gamma, \mathrm{id}, \mathrm{id}) \}(u) = L_{n1}(u) + L_{n2}(u), \qquad (7.1)$$

where

$$L_{n1}(u) = t_n^{-1} \{ \Gamma(u_1 + t_n G_{n1}(u_1),\ u_2 + t_n G_{n2}(u_2)) - \Gamma(u) \},$$
$$L_{n2}(u) = U_n(u_1 + t_n G_{n1}(u_1),\ u_2 + t_n G_{n2}(u_2)).$$

$U_n$ converges uniformly on $S_{2k,1} = [1/(2k), 2k]^2$ to U, which is uniformly continuous on
$S_{2k,1}$. Hence, since $\sup_{u_j \in [1/(2k), 2k]} |t_n G_{nj}(u_j)| \to 0$, we obtain $\sup_{u \in S_{k,1}} |L_{n2}(u) - U(u)| \to 0$.
It remains to consider the summand $L_{n1}$. A Taylor expansion yields

$$L_{n1}(u) = \sum_{j=1}^{2} \dot\Gamma_j(u)\, G_{nj}(u_j) + r_n(u),$$

where the remainder term is given by

$$r_n(u) = \sum_{j=1}^{2} \{ \dot\Gamma_j(v_n) - \dot\Gamma_j(u) \}\, G_{nj}(u_j)$$

with some intermediate point $v_n$. Observing that $|\dot\Gamma_j(u)| \le u_j^{-2} \le k^2$, uniform convergence
of $G_{nj}$ to $G_j$ on $[1/k, k]$ implies that the dominating term in the expansion of
$L_{n1}$ converges to $\sum_{j=1}^{2} \dot\Gamma_j(u) G_j(u_j)$, uniformly on $S_{k,1}$. By uniform continuity of $\dot\Gamma_j$ on
$S_{k,1}$ and boundedness of $G_{nj}$ we obtain $r_n(u) = o(1)$, uniformly. The other cases, i.e.,
$u \in ([1/k, k] \times \{\infty\}) \cup (\{\infty\} \times [1/k, k]) \cup \{(\infty, \infty)\}$, are treated similarly; the details are
omitted for the sake of brevity. □
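The convergence of $L_{n1}$ can be illustrated with a finite-difference check. The sketch below is a simplification, not the argument of the proof: it uses the Clayton Pareto Lévy copula $(u_1^\theta + u_2^\theta)^{-1/\theta}$ with $\theta = 1/2$ as an assumed example, fixed smooth directions in place of the sequences $G_{nj}$, and a single small value of t. The directions `G1`, `G2` and all function names are hypothetical.

```python
import math

THETA = 0.5


def gamma(u1, u2, theta=THETA):
    """Clayton Pareto Levy copula (assumed example)."""
    return (u1 ** theta + u2 ** theta) ** (-1.0 / theta)


def gamma_dot(j, u1, u2, theta=THETA):
    """Partial derivative of gamma with respect to its j-th argument."""
    s = u1 ** theta + u2 ** theta
    uj = u1 if j == 1 else u2
    return -s ** (-1.0 / theta - 1.0) * uj ** (theta - 1.0)


def G1(x):
    return math.sin(x)          # hypothetical perturbation direction


def G2(x):
    return x * math.exp(-x)     # hypothetical perturbation direction


def L_n1(u1, u2, t):
    """Difference quotient t^{-1} {Gamma(u + t G(u)) - Gamma(u)}."""
    return (gamma(u1 + t * G1(u1), u2 + t * G2(u2)) - gamma(u1, u2)) / t


def limit(u1, u2):
    """Claimed limit: sum over j of gamma_dot_j(u) * G_j(u_j)."""
    return gamma_dot(1, u1, u2) * G1(u1) + gamma_dot(2, u1, u2) * G2(u2)


u = (1.0, 2.0)
print(L_n1(*u, t=1e-6), limit(*u))
```

The two printed numbers differ only by a term of order t, which mirrors the role of the remainder $r_n$ in the Taylor expansion.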



Lemma 7.2 Let $\mathbb{D}_{\Psi} \subset \mathcal{B}_\infty((0,\infty])$ consist of all functions $f : (0,\infty] \to [0,\infty)$ that are
non-increasing and left-continuous with $f(\infty) = 0$. Recall (3.3) for the definition of the
generalized inverse function. Then the mapping

$$\Psi : (\mathbb{D}_{\Psi}, d) \to (\mathcal{B}_\infty((0,\infty]), d_2), \qquad f \mapsto f^{\leftarrow}$$

is Hadamard-differentiable at $P(x) = x^{-1}$ tangentially to the space

$$\mathcal{B}_0 = \{ h \in C((0,\infty]) \mid h(\infty) = 0,\ \lim_{x \to 0} x^2 h(x) = 0 \}$$

with derivative $(\Psi'_P(h))(x) = x^{-2} h(x^{-1})$.

Proof. Let $t_n \in \mathbb{R} \setminus \{0\}$, $h_n \in \mathcal{B}_\infty((0,\infty])$ and $h \in \mathcal{B}_0$ such that $t_n \to 0$, $d(h_n, h) \to 0$
and $P + t_n h_n \in \mathbb{D}_{\Psi}$. It suffices to show that for each $\varepsilon \in (0, 1)$, $M \in (1, \infty)$

$$\sup_{z \in [\varepsilon, M]} \left| \frac{(P + t_n h_n)^{\leftarrow}(z) - P^{\leftarrow}(z)}{t_n} - z^{-2} h(z^{-1}) \right| \to 0.$$

For $z \in [\varepsilon, M]$ set $\xi_n(z) = (P + t_n h_n)^{\leftarrow}(z)$. Choose $n_0 \in \mathbb{N}$ such that
$\sup_{x \ge (2M)^{-1}} |t_n h_n(x)| < \varepsilon/2$ for all $n \ge n_0$. We begin the proof by showing that
$\xi_n(z) \in [1/(2M), 2/\varepsilon]$ for all $n \ge n_0$. By monotonicity of $\xi_n$ we obtain

$$\xi_n(z) \le \xi_n(\varepsilon) \le \inf\{ x \ge 1/(2M) \mid 1/x + t_n h_n(x) \le \varepsilon \} \le \inf\{ x \ge 1/(2M) \mid 1/x \le \varepsilon/2 \} = 2/\varepsilon.$$

For the lower bound note that $\xi_n(z) \ge \xi_n(M)$ and set $x_0 = 1/(2M)$. Then

$$(P + t_n h_n)(x_0) = 2M + t_n h_n(1/(2M)) \ge 3M/2.$$

By monotonicity of $P + t_n h_n$ we obtain $(P + t_n h_n)(x) \ge 3M/2$ for all $x \le x_0$ and hence
$\xi_n(M) \ge x_0 = 1/(2M)$ as asserted.

By definition of the inverse, for $\varepsilon_n(z) = t_n^2 \wedge \xi_n(z)$,

$$(P + t_n h_n)(\xi_n(z)) \ge z \ge \varepsilon \vee \{ (P + t_n h_n)(\xi_n(z) + \varepsilon_n(z)) \} > 0.$$

Some careful calculations convert the latter estimate into

$$\frac{t_n \xi_n (\xi_n + \varepsilon_n) h_n(\xi_n + \varepsilon_n) - \varepsilon_n}{\{ \varepsilon (\xi_n + \varepsilon_n) \} \vee \{ 1 + t_n (\xi_n + \varepsilon_n) h_n(\xi_n + \varepsilon_n) \}} \le \xi_n - z^{-1} \le \frac{t_n \xi_n^2 h_n(\xi_n)}{1 + t_n \xi_n h_n(\xi_n)}, \qquad (7.2)$$

where we used the abbreviations $\xi_n = \xi_n(z)$ and $\varepsilon_n = \varepsilon_n(z)$. Since $\xi_n(z) \in [1/(2M), 2/\varepsilon]$,
boundedness of $h_n$ on $[1/M, 4/\varepsilon]$ implies $\sup_{z \in [\varepsilon, M]} |\xi_n(z) - z^{-1}| \to 0$. Dividing equation
(7.2) by $t_n$ and exploiting the facts that $\varepsilon_n \le t_n^2$ and that $x^2 h(x)$ is uniformly continuous
on $[1/(2M), 4/\varepsilon]$ the assertion follows. □
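The derivative formula of Lemma 7.2 can be checked numerically for a concrete direction. In the sketch below the generalized inverse is taken to be $f^{\leftarrow}(z) = \inf\{x > 0 : f(x) \le z\}$ (an assumption about (3.3), consistent with the infima used in the proof), it is computed by bisection, and the direction $h(x) = e^{-x}$ is a hypothetical choice with $h(\infty) = 0$ and $x^2 h(x) \to 0$ as $x \to 0$. All function names are illustrative.

```python
import math


def generalized_inverse(f, z, lo=1e-9, hi=1e9, iters=200):
    """inf{x > 0 : f(x) <= z} for non-increasing f with f(lo) > z >= f(hi), by bisection."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)   # bisection on the logarithmic scale
        if f(mid) <= z:
            hi = mid
        else:
            lo = mid
    return hi


def h(x):
    return math.exp(-x)            # hypothetical direction in the tangent space


def difference_quotient(z, t):
    """t^{-1} {(P + t h)^{<-}(z) - P^{<-}(z)} with P(x) = 1/x and P^{<-}(z) = 1/z."""
    perturbed = lambda x: 1.0 / x + t * h(x)   # non-increasing for t > 0
    return (generalized_inverse(perturbed, z) - 1.0 / z) / t


def hadamard_derivative(z):
    """Claimed derivative z^{-2} h(z^{-1})."""
    return z ** -2 * h(1.0 / z)


for z in (0.5, 1.0, 2.0, 4.0):
    print(z, difference_quotient(z, 1e-6), hadamard_derivative(z))
```

For small t the difference quotient and the claimed derivative agree up to a term of order t on compact sets of z bounded away from 0.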